Computer-assisted testing applications have proliferated in
recent years. Despite the dramatic growth in the number of
tests that have undergone computer conversion, there is little
research data to justify the use of computers. Few studies
have been conducted wherein the researchers specifically
examined the issue of parity between computerized and
conventional forms. Studies are required to determine whether
the differences incurred as a result of the conversion process
are significant enough to jeopardize the integrity of tests.

The primary purpose of this research was to examine the
equivalency  issue  in  the  area  of  aptitude  testing. 
Specifically,  the  study  objective  was  to  investigate  whether 
significant differences existed in mean aptitude test scores
when subjects were administered the conventional and
computerized adaptive versions of the Differential Aptitude
Tests (DAT). A secondary purpose was to determine the
degree to which ancillary factors (e.g., computing experience)
influenced performance on the computerized version.

The sample consisted of 40 high school freshmen. The
statistical tests used to investigate the research
objectives were the Spearman rho rank-order correlation,
analysis of variance, Dunn's test, the Kolmogorov-Smirnov
goodness-of-fit test, and linear regression.

The  computerized  adaptive  version  of  Form  V  of  the  DAT 
was  found  to  be  equivalent  to  the  conventional  Form  V  of  the 
DAT  for  seven  of  the  eight  subtests.     The  two  versions  of 
the  DAT  had  equivalent  mean  scores,  comparable  shapes  of 
distributions  of  the  scores,  and  similar  rank  order  of 
individual  scores.     The  subtest  for  which  equivalency  was 
not  established  was  the  clerical  speed  and  accuracy  subtest. 

Despite  the  overall  comparability  between  versions,  the 
ancillary  factors  of  computing  experience  and  attitude 
toward  computers  were  found  to  correlate  with  performance  on 
the  computerized  version.     Typing  experience,  however,  was 
not  found  to  correlate  with  computerized  test  performance. 

In  this  experiment  the  suitability  of  assessing  one's 
aptitude  through  computerized  adaptive  testing  was 
confirmed. The findings lead one to conclude, however, that
while  the  majority  of  the  aptitude  tests  incurred  only  minor 
changes  as  a  result  of  the  conversion  process,  not  all 
conventional  tests  may  be  candidates  for  computer 
conversion.

CHAPTER I
INTRODUCTION 


In  this  age  of  high  technology  the  computer  and  its 
myriad  of  applications  span  practically  all  aspects  of  life. 
Computers  are  currently  used  inter  alia  to  monitor  patients' 
vital  signs,  diagnose  diseases,  tutor  students  with 
tailor-made  instructions,  send  men  and  women  into  space,  and 
challenge us with thousands of electronic games. The
computer's  capability  to  collect,  store,  analyze,  and 
synthesize  data,  coupled  with  its  ability  to  exchange  and 
process  these  data  with  other  computers  at  microsecond 
rates,  has  created  a  proliferation  of  information.  For 
many,  information  on  a  number  of  subjects  is  immediately 
available  at  their  fingertips.     The  opportunity  to  exchange 
information  quickly  with  others  has  created  a  synergism  that 
has  further  contributed  to  this  information  explosion. 

Some  find  this  burgeoning  wealth  of  data  overwhelming 
and  believe  that  all  this  man-machine  interaction  is  causing 
a  dehumanization  of  society.     Many  others  claim  the  advances 
made  possible  by  computer  technology  are  both  exhilarating 
and  extraordinarily  beneficial  to  society.     Still  others 
believe  that  the  immediate  availability  of  words,  data, 
numbers, and images through symbolic technologies will allow
us to aspire to new levels of cognition by permitting us to
circumvent low-level limitations (e.g., computational
ability) that previously created impasses in cognitive
processes (Perkins, 1985).

Computer  technology  is  presently  viewed  by  many  as 
playing  a  vital  role  in  both  the  counseling  and  education 
areas.     Although  it  is  impossible  to  delineate  all  the  ways 
computers  are  presently  being  employed,  several  reported 
applications  include  data  collection  and  analysis,  computer- 
assisted  instruction,  computer-assisted  biofeedback, 
computer-assisted  career  guidance,  computer-assisted 
treatment,  and  computer-assisted  assessment  (Sampson, 
1983a). 

One  of  the  most  controversial,  as  well  as  potentially 
promising,  applications  is  computer-assisted  testing. 
During  the  1983  American  Psychological  Society  meeting,  many 
discussions  and  demonstrations  were  conducted  that  proposed 
various  ways  of  assessing  psychological  functions  on  a 
computer   (Hunt  &  Pellegrino,   1984) .     The  Educational  Testing 
Service  has  predicted  that  by  1992  students  will  be 
administered  a  variety  of  tests  using  microcomputers, 
including  the  Scholastic  Aptitude  Test  and  the  Graduate 
Record  Examination   (Biemiller,  1982) . 

While  there  appears  to  be  much  enthusiasm  surrounding 
the  emergence  of  computer-assisted  testing,  many  have  voiced 
misgivings and concern that the introduction of computer
technology into the testing arena could significantly alter
the nature of testing (Burkhead & Sampson, 1985).
Research in which the investigator seeks to establish whether
or  not  computer-assisted  tests  are,  in  fact,  equivalent  to 
their  conventional  counterparts  is  in  its  infancy.     Only  a 
few  equivalency  studies  have  been  conducted;  the  results  and 
conclusions  from  these  studies  are  mixed   (Honaker  &  Harrell, 
1987) .     Rigorous  investigations  are  essential  in  order  to 
categorize  specific  types  of  tests  that  can  and  that  cannot 
be  administered  with  computer  assistance  without  losses  in 
reliability  and  validity.     Also,  studies  in  which  the 
investigator  seeks  to  establish  the  populations  for  which 
computer-assisted  testing  is  either  appropriate  or 
inappropriate  are  equally  vital  to  the  future  of  the 
assessment  field. 

Background 

Computer-assisted  testing  applications  have 
proliferated  in  recent  years   (Berven,   1984;  Burkhead  & 
Sampson,  1985) .     The  majority  of  the  literature  on  this 
subject would lead one to conclude that most, if
not all, tests will ultimately be administered by the
computer   (Burkhead  &  Sampson,   1985;  Hunt,   1982;  Sampson, 
1983a;  Weiss,   1983)  .     Psychometricians  agree  that 
computer-assisted  testing  is  considerably  more  efficient 
than  conventional  methods   (Ward,   1984;  Wood,   1984).  Some 
educators and counselors have contended that the potential
for adaptive testing may be one of the greatest advantages
of  computer-assisted  assessment   (Henson,  Ross,  &  Harris, 
1977;  Hunt  &  Pellegrino,  1984)  .     Others  have  viewed  the 
economic  advantages  as  being  a  primary  incentive  to  convert 
tests  to  computer  for  administration   (Byers,  1981;  Sampson, 
1983a;  Ward,  1984) .     Advantages,  such  as  flexibility  in 
scheduling  of  tests   (Wood,  1984)   and  rapid  reporting  of 
scores  and  immediate  feedback  for  decision  making  (Sampson, 
1983b) ,  have  also  contributed  to  the  rapid  acceleration  of 
computer-assisted  testing. 

Others,  however,  have  cautioned  that  differences  which 
may  result  from  the  conversion  process  might  significantly 
alter  the  nature  of  tests   (Honaker  &  Harrell,   1987) .  Some 
have  been  specifically  concerned  that  during  the  conversion 
process  specific  test  items  will  be  altered  or  format 
changes  will  result  that  could  influence  overall  performance 
(Duthie,  1984) .     Still  others  have  expressed  concern  that 
computer-assisted  testing  adds  a  new  dimension  to  the 
testing  process   (Burkhead  &  Sampson,  1985) .     For  example,  in 
addition  to  the  intended  traits  or  skills,  one's  computer 
interaction  ability  may  also  be  implicitly  assessed  through 
computerized  tests. 

Computers  are  currently  being  applied  to  a  myriad  of 
assessment  areas,  while  the  validity  of  various  applications 
rests on limited research (Burkhead & Sampson, 1985).
Additional research to gain a general understanding of
advantages,  disadvantages,  and  suitability  of 
computer-assisted  assessment  systems  appears  to  be 
warranted.     Specifically,  the  need  exists  to  understand  the 
differences  resulting  from  the  computer  conversion  process 
if  appropriate  utilization  of  this  technology  is  to  be 
realized. 

In  view  of  these  concerns,  there  appears  to  be  a 
disproportionate  number  of  tests  that  have  been  converted  to 
computer  administration  in  comparison  to  the  number  of 
parity  studies  that  have  been  reported.       Further  studies  in 
which  investigators  attempt  to  determine  whether 
computerized  methods  of  assessment  are  equivalent  to 
conventional  testing  are  essential  if  the  integrity  of  tests 
is  to  be  preserved.     The  intent  of  this  research  was  to 
examine  the  equivalency  issue  in  the  area  of  aptitude 
testing.     It  was  postulated  that  important  insights 
regarding  the  question  of  equivalency  of  conventional  and 
computer-assisted  methods  would  emerge  by  administering  a 
multiple  aptitude  battery  of  tests  to  a  sample  population 
using  both  methods  and  by  analyzing  the  obtained  data. 

Need  for  the  Study 

Burkhead and Sampson (1985) reported that the
development  of  computer  applications  in  the  area  of 
assessment has accelerated in recent years. By 1982, many
diagnostic tests and achievement tests had been computerized
(Okey  &  McGarity,   1982)  .     Johnson   (1984)    stated  that  by  1984 
approximately  500  clinicians  had  programmed  their  personal 
computers  to  administer  and  score  psychological  tests  and 
that  at  least  another  500  clinicians  had  access  to 
computer-assisted  assessment  systems  developed  by  others. 
In  addition,  several  software  companies  have  developed  a 
variety  of  testing  packages.     Berven   (1984)   anticipated  that 
the  application  of  computer  technology  to  the  testing  area 
would  undergo  rapid  acceleration  in  future  years. 

Despite  the  dramatic  growth  in  the  use  of 
computer-assisted  assessment  systems,  there  is  little 
research  data  to  justify  such  use.     Few  studies  have  been 
conducted  wherein  the  researchers  examined  the  issue  of 
parity  between  the  computerized  versions  and  their 
respective  conventional  forms.     The  equivalency  studies  that 
have  been  conducted  have  generated  mixed  results   (Calvert  & 
Waterfall,   1982;  Honaker  &  Harrell,   1987;  Kingsbury  &  Weiss, 
1981;  Lukin,  Dowd,  Plake,   &  Kraft,   1985;  Wood  &  Strider, 
1980).       For  example,  converted  Minnesota  Multiphasic 
Personality  Inventory   (MMPI)   tests  have  been  reported  by  the 
investigators  to  have  generated  equivalent  results  with  the 
conventionally  administered  version   (White,  Clements,  & 
Fowler,   1985) .     The  converted  Fingertapping  Test,  however, 
was reported to have generated significantly
different results from those of the conventionally
administered version (Honaker & Harrell, 1987).

While  Hunt  (1982)   stated  that  the  same  estimate  of 
ability  ought  to  be  obtained  by  both  testing  methods,  others 
have  felt  uncomfortable  with  this  assumption.     Space  (1981) 
cautioned  that  differences  in  administration  methods  may 
affect  test  scores.     Duthie   (1984)  noted  a  series  of 
differences  between  traditional  and  computer-assisted 
testing  that  may  impact  on  test  results.     For  example,  the 
reflection  time  normally  associated  with  conventional  tests 
is  not  always  present  in  rapidly  responding  computer 
systems;  this  difference  may  influence  performance  (Duthie, 
1984)  .     Duthie  suggested  further  that  differences  in  format 
between  the  versions  may  also  contribute  to  differences  in 
overall  performance. 

Hunt  and  Pellegrino   (1984)   concurred  with  these 
assumptions  and  claimed  that  interacting  with  computers  does 
require  a  specialized  ability.     Lansman,  Donaldson,  Hunt, 
and  Yantis   (1982)   investigated  the  correlations  between 
computer-administered  and  conventionally  administered  paper 
and  pencil  tests  and  found  that  the  error  components  of  the 
computer-administered  tests  were  highly  correlated  with  each 
other  but  not  with  the  error  components  of  the 
conventionally  administered  paper  and  pencil  tests.  The 
conclusions, although tentative, are that there are separate
factors for item presentation mode and that certain groups
or  subjects  may  be  selectively  favored  or  disfavored  by 
computer-assisted  testing  (Lansman  et  al.,   1982).  Loftus 
and  Loftus   (1983)  have  also  found  that  computerized  formats 
seem  to  favor  some  groups  of  individuals  more  than  other 
groups.     How  this  might  affect  performance  remains  to  be 
determined. 

Burkhead  and  Sampson   (1985)    suggested  that  the 
mystique  associated  with  computers  may  also  influence 
performance  on  computer-assisted  assessment  systems.  That 
is,  the  lack  of  familiarity  with  computers  may  generate 
anxiety  in  some  individuals  that,  in  turn,  may  influence  the 
response  set  of  the  user.     It  is  believed  that  this  anxiety 
and  uneasiness  with  computers  may  influence  one's  overall 
performance   (Burkhead  &  Sampson,   1985)  .     Hunt  and  Pellegrino 
(1984)   claimed  that,  in  addition  to  lack  of  prior  keyboard 
or  computer  experience,  previous  failures  with  computers 
could  degrade  one's  performance.     It  is  also  conceivable 
that  the  absence  of  rapport  with  a  test  examiner,  which  is 
characteristic  of  a  computer-assisted  testing  environment, 
could  alter  the  performance  of  some  individuals. 

If  the  combined  effects  of  any  of  these  factors  were 
to  create  significant  overall  differences  between  a 
computer-administered  version  and  its  conventional 
counterpart, then the tests clearly would not measure in the
same way. This means that the reliability and validity of a
particular  computer-converted  test  would  be  suspect.     If  the 
tests  or  testing  conditions  are  significantly  altered  by  the 
conversion  process  so  as  to  alter  test  scores  or  alter  test 
results,  the  tests  cannot  be  considered  reliable  or  valid 
instruments  and  the  existing  normative  data  would  no  longer 
be  applicable.     Because  of  the  vast  numbers  of 
computer-assisted  systems  that  are  currently  available  and 
receiving  wide-scale  usage  and  the  few  equivalency  studies 
that  have  been  conducted  and  documented,  it  appears  likely 
that  at  least  some  computer-assisted  versions  may  be 
yielding  inaccurate  results  or  interpretations.  This 
likelihood  is  further  reinforced  by  the  fact  that  different 
practitioners,  who  possess  varying  degrees  of  computing 
expertise,  are  converting  tests  with  varying  degrees  of 
sophistication  and  skill.     While  some  conversion  processes 
may  be  sound,  others  may  not  be  valid. 

The  questions  that  need  to  be  addressed  relate  to  the 
extent  of  these  differences  between  computer-administered 
and  conventional  test  methods  and  the  situations  in  which 
these  differences  exist.     Not  enough  research  has  been 
conducted  to  ascertain  whether  significant  differences 
generally  exist  and,  if  so,  whether  differences  exist  only 
within  certain  populations  and  whether  they  apply  only  to 
particular types of tests or specific conversion efforts.
It is conceivable that for some individuals (e.g., those who
have  had  previous  positive  computing  experience) ,  the 
transfer  to  computer-assisted  testing  could  be  made  without 
difficulty  and  test  results  by  either  method  would  be 
essentially identical. On the other hand, for others
(e.g.,  those  who  are  unfavorably  biased  against  computers), 
the  differences  between  testing  methods  might  produce 
significantly  different  results.     Further,  some  types  of 
tests   (i.e.,  paper  and  pencil  formats)   can  be  converted  from 
conventional  to  computer-assisted  administration  quite 
easily  and  with  only  minor  changes,  while  other  types  of 
tests   (e.g.,  individually  administered  intelligence  tests) 
would  require  more  consequential  alterations  to  accomplish 
such  a  conversion. 

Evidence  of  the  types  of  tests  that  can  be  successfully 
conducted  and  the  individual  traits  and  abilities  that  can 
be  successfully  measured  by  computers  is  essential  in  order 
to  ensure  that  this  new  technology  is  properly  applied  and 
that  test  results  are  consistent  with  those  derived  from 
established  assessment  techniques.     Before  computer-assisted 
testing  can  be  validated  across  all  professional  lines, 
numerous  reliability  studies  must  be  conducted  on  a 
test-by-test basis and a conversion-by-conversion basis.
The  reliability  studies  are  necessary  in  order  to  delineate 
those tests that can and those that cannot be converted to
computer methodologies without reductions in reliability.
Specific  reliability  and  validity  attributes  of  the  various 
computer-assisted  assessment  systems  must  be  established 
with  respect  to  various  populations  to  permit  intelligent 
interpretation  of  test  results   (Johnston,  Buescher,  & 
Heppner,  1988) . 

Rationale for Multiple Aptitude Battery Selection

The multiple aptitude battery was selected as the
category  of  test  for  this  research  over  other  types  of  tests 
because  of  its  informational  yield  and  the  potential  to 
generalize  the  results.     According  to  Anastasi   (1982)  , 
differential  aptitude  batteries  are  comprised  of  a  wide 
variety  of  tasks  that  allow  for  performance  analyses  related 
to  different  aspects  of  intelligence.     Unlike  the  general 
intelligence  instruments  where  test  items  are  selected  to 
provide  a  unitary  and  internally  consistent  measure,  the 
differential  aptitude  batteries  include  those  subtests  and 
items  that  maximize  intraindividual  variation.  The 
factor-analytic  techniques  used  during  the  construction  of 
the  multiple  aptitude  batteries  allow  the  various  abilities 
loosely  grouped  under  "intelligence"  to  be  systematically 
identified,  sorted,  and  measured   (Anastasi,  1982)  .  Because 
the  differential  aptitude  batteries  generate  a  separate 
score  for  each  identified  ability  in  lieu  of  a  global  score, 
the opportunity existed to examine abilities that can and
cannot be reliably measured by computer-assisted test
methods.

Furthermore,  because  of  the  varying  nature  of  the 
different  tests,  the  degree  of  complexity  associated  with 
the  conversion  of  each  subtest  also  fluctuated.  The 
diversity  of  the  different  tests  in  terms  of  format  and 
structure  provided  the  opportunity  to  extrapolate  these 
results  to  other  computer-assisted  tests. 

Another  rationale  for  selecting  the  multiple  aptitude 
test  for  this  research  is  that  it  affords  the  potential  to 
facilitate  a  variety  of  educational  efforts.     For  example, 
Anastasi   (1982)   stated  that  aptitude  batteries  can  be  used 
to  enhance  the  selection  and  classification  of  individuals 
for  specific  training  programs.     Tolbert   (1980)   claimed  that 
aptitude  batteries  provide  valuable  information  for  planning 
academic  programs  or  choosing  occupations.     Batteries  that 
generate  vocational  profiles  can  indicate  occupational 
directions  the  individual  might  successfully  pursue. 

The  opportunities  to  employ  multiple  aptitude  batteries 
and  realize  the  many  concomitant  benefits  are  limited 
because  of  the  time  and  cost  associated  with  conventional 
means  of  administration  and  scoring.     If,  however,  multiple 
aptitude  batteries  in  particular  and  computer-assisted 
aptitude testing methods in general were determined to be
both reliable and valid, time and cost barriers could be
easily  overcome  and  the  many  educational  benefits  of 
aptitude  testing  could  be  realized  on  a  wide-scale  basis. 
Based  on  these  factors  and  the  potential  for  extrapolating 
the  results  to  other  types  of  aptitude  tests,  the  multiple 
aptitude  battery  was  selected  as  an  appropriate  assessment 
tool  for  this  study. 

Purpose  of  the  Study 

The  primary  purpose  of  this  study  was  to  investigate 
whether  significant  differences  existed  in  mean  aptitude 
test  scores  between  tests  employing  computer-assisted  and 
conventional methods and to establish the degree to which the
two testing methods correlated. Subtest areas were examined
on an individual basis in order to determine whether results
could be generalized to all aptitude tests within the
battery or whether significant differences occurred only on
some specific tests.
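
As an illustration only, the two quantities named above (the difference in mean scores and the rank-order correlation between methods) can be sketched for a single subtest as follows. The scores are hypothetical and are not data from this study; the function names are likewise assumptions made for the example, and the sketch uses only the Python standard library.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical scores for six examinees on one subtest (not study data).
conventional = [31, 27, 40, 22, 35, 29]
computerized = [30, 28, 41, 20, 33, 31]

rho = spearman_rho(conventional, computerized)
mean_difference = mean(c - p for p, c in zip(conventional, computerized))
print(f"rho = {rho:.3f}, mean difference = {mean_difference:.3f}")
```

A rho near 1 indicates that examinees keep roughly the same rank order under both methods, and a mean difference near zero indicates comparable average scores. Neither figure by itself is a significance test.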

A  secondary  purpose  was  to  delineate  possible  causes 
for  any  discrepancies  that  were  found.     The  study  examined 
specific  factors,  which  had  been  identified  in  the 
literature  as  potential  causes  for  such  differences  (e.g., 
prior  computing  experience) ,  in  an  effort  to  determine  the 
degree  to  which  they  influenced  the  variability  of 
performance  on  the  computer-assisted  version  of  the  tests. 
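
The influence of an ancillary factor on computerized test performance can be illustrated with a simple least-squares regression. The sketch below is a hedged example rather than the analysis used in this study: the experience ratings and scores are hypothetical, and the function name is an assumption made for the example.

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared), where r_squared is the share
    of variance in y accounted for by x.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_total = sum((b - my) ** 2 for b in y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared

# Hypothetical data: self-rated computing experience (1-5) and
# computerized subtest score for six examinees (not study data).
experience = [1, 2, 2, 3, 4, 5]
score = [20, 24, 23, 27, 30, 33]

slope, intercept, r_squared = simple_linear_regression(experience, score)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r_squared:.2f}")
```

Here the slope estimates the change in score per unit of experience, and R^2 estimates how much of the variability in computerized scores the factor accounts for.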

Only  after  study  efforts  of  this  kind  allow  for  the 
conclusion that there is no significant difference between
conventional and computer-assisted versions should
computer-assisted testing be conducted on a wide-scale basis
with the attendant benefits of affordability and
availability. Only after studies of this nature have been
conducted and their results validated can the advantages of
computer-assisted assessment systems be fully realized.

Theoretical Framework Underlying the Problem

According  to  the  experts  in  the  assessment  field,  one 
of  the  primary  disadvantages  of  computer-assisted  assessment 
systems  is  that  significant  differences  in  tests  may  result 
after  they  have  been  converted  to  computer  (Burkhead  & 
Sampson, 1985; Duthie, 1984). Even minor changes that occur
as  a  result  of  the  conversion  may  significantly  alter  the 
testing  experience  for  some  segments  of  the  population. 

Learning  theorists  contend  that  the  way  an  individual 
responds  to  any  given  situation  directly  reflects  the 
individual's  learning  history,  and  this  would  apply 
specifically  to  his  or  her  performance  on  a 
computer-assisted  test.     Behavior  theorists  have  specified 
precisely,  in  terms  of  established  learning  principles,  how 
specific  stimuli  in  the  environment  elicit  certain  responses 
(Bower  &  Hilgard,   1981)  .     In  accordance  with  these  concepts, 
one's  performance  on  a  test,  then,  is  more  than  a  measure  of 
one's  ability.     It  is  also  one's  reaction  to  the  totality  of 
stimuli in the specific environment.

Wechsler (1981) also supported the viewpoint that
performance  on  a  test  is  largely  dependent  upon  one's 
personality  and  one's  individualized  way  of  responding  to 
a  particular  situation.     Specifically,  Wechsler  claimed  that 
test performance is responsive to factors other than
those included under the auspices of cognitive abilities,
such as drives, needs, emotions, incentives, persistence,
goal awareness, anxiety, and other conative dispositions.
It  should  be  noted,  however,  that  while  Wechsler  stressed 
these  nonintellective  factors  as  being  important,  he  also 
claimed  that  "no  amount  of  drive  will  develop  a  dullard  into 
a  mathematician  any  more  than  good  intentions  alone  will 
suffice to make a person a saint" (Wechsler, 1981, p. 8).

Behavior theorists use two principal schemes to organize
and study learning processes, to understand human behavior,
and to explain how relatively minor differences in the
environment can sometimes trigger major differences in
responses. The first is Pavlovian, or
classical  conditioning,  which  explains  learning  in  terms  of 
a  predictive  contingency  that  is  established  between 
associated  events   (Bower  &  Hilgard,   1981) .     The  second  is 
operant  conditioning  by  which  learning  takes  place  when 
reinforcement  is  made  contingent  upon  a  particular  response 
(Bower  &  Hilgard,   1981)  . 

Pavlovian  conditioning  implies  that  mere  temporal 
contiguity between particular stimuli can result in their
becoming associated. For example, if a specific negative
event coincided repeatedly with a college student's initial
sessions on a computer terminal, then eventually, given
sufficient repetition and the correct time relationship, the
computer testing experience itself might evoke a negative
reaction (e.g., anxiety) independent of the original negative
stimulus.

Duthie   (1984)  claimed  that  one  of  the  primary  reasons 
that  differences  occur  on  computerized  versions  of  tests 
involves  a  negative  bias  against  computers.     Accordingly,  an 
individual's  discomfort  or  anxiety  associated  with  computers 
in  general  could  influence  that  individual's  performance  on 
computer-assisted  tests.     There  are  now  a  large  number  of 
experiments  and  theoretical  treatments  that  have 
demonstrated  that  anxiety,   fear,  and  various  other  emotions 
can  be  acquired  through  Pavlovian  conditioning  (Bolles, 
1969; Mackintosh, 1974). In fact, there is now significant
evidence  that  allows  one  to  conclude  that  Pavlovian 
conditioning  can  lead  to  emotional  learning. 

There  is  equally  strong  evidence  that  suggests  that  the 
ability  of  a  stimulus  to  elicit  a  response  such  as  anxiety 
increases  as  the  stimulus  becomes  more  similar  to  the  one 
used  during  initial  conditioning   (Bower  &  Hilgard,   1981) . 
The  theory  that  supports  this  precept  is  called  "stimulus 
generalization." Applying the stimulus generalization
theory to the college student in the example above would
proceed as follows: If the student were to perceive an
upcoming  computer-assisted  testing  experience  to  be  very 
similar  to  the  previously  painful  testing  experiences, 
the  anticipation  and  anxiety  would  be  greater  than  if  the 
situation  were  viewed  as  being  somewhat  less  similar. 

The  feelings  of  anxiety  and  discomfort  associated  with 
computers  generally  can  be  further  complicated  if  the 
student  in  question  senses  considerable  pressure  to  excel 
and  at  the  same  time  feels  little  or  no  hope  of  being  able 
to  do  so.     According  to  Bower  and  Hilgard  (1981)  ,  a  sense  of 
helplessness  can  lead  to  changes  in  emotionality  and 
impairments  in  capability.     In  such  cases,  where  there  is  no 
apparent control over a painful stimulus or situation,
helplessness  is  perceived  and  cognitive  deficits  result 
(Bower  &  Hilgard,   1981) . 

In  accordance  with  operant  conditioning,  the  second 
primary  scheme  used  to  study  learning,  when  a  response  that 
is  emitted  is  instrumental  in  achieving  a  goal  or  satisfying 
a  need,  the  probability  of  repeating  that  response  is 
increased.     For  example,   if  a  student  had  a  history  of  being 
rewarded  by  a  teacher  or  an  examiner  upon  completion  of  a 
successful  test,  he  or  she  would  learn  that  favorable 
performance  had  advantages  that  may  in  turn  generate 
incentive and perseverance to excel on subsequent tests.

Sheffield (1966) developed the proposition that
reinforcing  stimuli  operate  as  they  do  because  they  act  as 
incentives.     That  is,  they  are  events  that  increase  drives. 
An  individual  who  has  received  a  substantial  amount  of 
positive  reinforcement  from  others  may  have  significantly 
more  incentive  to  perform  on  a  conventionally  administered 
test  than  on  a  computerized  test,  where  such  reinforcements 
would  not  be  obtainable.     The  absence  of  a  human  examiner 
was  identified  by  Burkhead  and  Sampson   (1985)   as  being  a 
possible  cause  of  differences  in  scores  between  methods  for 
some  individuals. 

In  addition  to  the  myriad  of  learned  associations  and 
reinforced  behaviors  that  influence  an  individual's  overall 
ability  to  perform  in  a  testing  situation   (or  any 
situation) ,  learning  theorists  generally  contend  that  mode 
of  presentation  can  also  impact  on  performance.  For 
example,  a  computer-administered  test,  which  relies  on 
visual  presentations,  might  produce  vastly  different  results 
from  the  same  type  of  test  when  conducted  by  an  examiner  who 
presents  the  majority  of  test  items  orally.  Learning 
theorists  have  determined  that  each  sensory  modality  has  its 
own  memory  system  and  capacity  for  handling  information. 
Typically,  an  individual's  iconic  memory  and  echoic  memory 
process and store data differently (Wallace, Goldstein, &
Nathan, 1987). Phonological coding followed by rehearsal
may be the optimal strategy for certain individuals, while
visual coding and rehearsal may be easier for others (Bower
& Hilgard, 1981).

Wickelgren   (1979)   concluded  from  his  research  on 
primary  memory  that  short-term  memory  is  especially 
sensitive  to  similarities  in  sound  or  pronunciation.  When 
similarity  was  high,  there  was  much  less  retention  than  when 
it  was  low.     For  example,  if  an  examinee  were  asked  to 
repeat  a  series  of  digits  that  sounded  very  similar  to  sets 
of digits presented previously, interference would be likely to
result and impair performance. The same level of
interference  would  not  be  expected  in  the  computer  version 
of  this  type  of  test  since  digits  would  typically  be 
presented  visually  instead  of  orally  and,  therefore,  would 
not  be  subject  to  the  same  sensitivity. 

In  addition  to  differences  in  interference  levels  that 
may  occur  as  a  result  of  oral  presentations  vis-a-vis  visual 
presentations,  variations  in  listening  and  reading 
comprehension  may  also  result  in  overall  differences  in 
performance   (Johnston  et  al.,   1988).     While  the  normal 
population  may  exhibit  relatively  small  differences  between 
written  comprehension  and  listening  comprehension, 
significant  differences  are  noted  for  certain  populations. 
For  example,  individuals  suffering  from  dyslexia  or 
inadequate  reading  skills  will  exhibit  significantly  less 
comprehension  from  written  material  than  from  spoken 
material.

Group-administered paper and pencil tests are altered
by  changes  in  mode  of  presentation  in  another  way. 
Specifically,  the  opportunity  to  review  and  reflect  upon 
previous  test  items  is  not  available  on  the  standard 
computer-administered  versions.     Retrieval  cues,  which  are 
pieces  of  information  that  are  used  to  help  locate  other 
related  items  of  information  in  memory,  are  often  embedded 
in  previous  test  questions   (Wallace  et  al.,  1987).  By 
facilitating  the  activation  of  memory  nodes,  the  employment 
of  retrieval  cues  can  result  in  a  more  efficient  search  of 
long-term memory. It follows that one's ability
to retrieve test information from long-term memory should
be greater when reflection upon previous test items and
the identification of retrieval cues are permissible.
Duthie (1984) cited the lack of reflection time on
computer-administered tests as a potential cause of
differences in scores between methods of administration.

These  are  only  a  few  reasons  why  learning  theorists 
would  suggest  that  differences  in  test  methods  will  result 
in  differences  in  performance.     The  intent  of  the 
research  was  to  examine  this  assertion  and  to  identify  any 
factors that, either separately or collectively, resulted
in overall significant differences between conventional and
computer-assisted test methods.

Definition of Terms

Advanced achievers is a category of students whose
performance on a standardized achievement test falls in the
7th, 8th, or 9th stanine level.

Average achievers is a category of students whose
performance on a standardized achievement test falls in
the 4th, 5th, or 6th stanine level.

Basic  level  achievers  is  a  category  of  students  whose 
performance  on  a  standardized  achievement  test  falls  in  the 
3rd  stanine  level  or  below. 

Computer-assisted  test  is  a  psychometric  assessment 
instrument  that  is  administered  with  the  assistance  of  a 
microcomputer.     Test  items  are  generally  presented  on  a 
computer  terminal  screen  and  answers  are  entered  into  the 
computer  by  means  of  a  keyboard  or  special  input  device  such 
as  a  joystick,   light  pen,  or  mouse. 

Conventionally  administered  test  is  a  psychometric 
assessment  instrument  that  is  administered  exclusively  by  a 
human  examiner.       Test  items  are  generally  presented 
verbally  by  the  examiner  to  one  examinee  at  a  time  or  are 
administered  in  printed  form  to  many  examinees 
simultaneously. 

Multiple aptitude battery is an assessment instrument
designed to sample a wide variety of cognitive functions and
provide an intellectual profile showing the individual's
characteristic strengths and weaknesses (Anastasi, 1982).

Self-administered questionnaire is an instrument, completed by
the respondent, that is used to collect data. Structured items,
which provide a fixed number of alternative ways of
responding to the questions, are used in order to elicit
precise answers (Isaac & Michael, 1981).
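
The three achievement categories defined above amount to a simple mapping from stanine score to group label. The following is a minimal sketch of that mapping, assuming the usual nine-point stanine scale; the function name and the label strings are illustrative and do not come from the study.

```python
def achiever_category(stanine: int) -> str:
    """Map a stanine score (1-9) to the achievement category defined in the text.

    The name and return labels are hypothetical, chosen for illustration only.
    """
    if not 1 <= stanine <= 9:
        raise ValueError("stanine scores range from 1 to 9")
    if stanine >= 7:
        return "advanced"   # 7th, 8th, or 9th stanine
    if stanine >= 4:
        return "average"    # 4th, 5th, or 6th stanine
    return "basic"          # 3rd stanine or below
```

For example, a student scoring in the 5th stanine would be labeled an average achiever under these definitions.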


Delimitations 

The following are the delimitations that apply to this
study: 

1. The multiple aptitude battery that was used for this
research was the Differential Aptitude Tests (DAT). It
should be noted, however, that some classes of tests
included in the DAT are common to other aptitude batteries.
For example, tests to assess numerical, verbal, clerical,
and spatial abilities are contained in both the DAT and the
General Aptitude Test Battery (GATB) (Anastasi, 1982).

2.  The  subjects  for  the  test  were  volunteers  from  the 
freshman  class  of  a  county-wide  school  district.     Two  of  the 
three  academic  grouping  categories  were  represented  in  the 
sample population.

3. The conventionally administered version of the test
was  administered  by  a  licensed  mental  health  counselor.  The 
computer-assisted  version  was  presented  on  a  microcomputer 
provided  by  the  school.     Each  student  was  administered  both 
Forms  V  and  W  of  the  test.     Half  of  the  students  were 
administered  the  conventional  version  of  Form  V,  while  the 
other  half  of  the  students  took  the  computerized  adaptive 
version  of  Form  V.     All  students  were  administered  the 
conventional  version  of  Form  W. 

4.  The  self-administered  questionnaire,  which  was 
developed  for  the  study,  was  given  to  each  participant  in 
the  experimental  group.     The  items  in  the  instrument  were 
intended  to  obtain  information  concerning  the  participant's 
computer  and  keyboard  experience  and  his  or  her  attitude 
toward  computers   (see  Appendix  B) . 

Limitations 

The  limitations  of  this  study  are  cited  below: 

1.  The  results  pertain  only  to  the  DAT  and  cannot  be 
generalized  to  other  multiple  aptitude  tests.  Inferences 
may  be  drawn,  however,  regarding  computer  administration  of 
classes  of  tests  that  are  common  to  other  types  of  aptitude 
scales.

2.  The  results  are  not  necessarily  generalizable  to 
the total Orange County public high school population or to
students from other high schools since the participants were
volunteers from a single county.

3.  The  results  are  not  generalizable  to  other  age 
group  categories. 

4.  The  conventionally  administered  version  was  given  by 
only  one  licensed  mental  health  counselor.       As  a  result, 
the  possibility  exists  that  some  bias,  with  respect  to 
national  norms,  could  have  been  introduced  into  the  results. 

5. The responses obtained from the computer-administered
version were dependent upon the computer-screen
representation of the cognitive function being tested. The
results  are  not  necessarily  generalizable  to  other 
computer-assisted  tests. 

6.  The  utility  of  the  self-administered  questionnaire 
was  dependent  upon  the  validity  of  the  respondent's  answers 
and  his  or  her  attitudes  at  the  time. 

Organization  of  the  Study 
The  intent  of  Chapter  I  has  been  to  provide  essential 
background  and  define  both  the  purpose  and  the  need  for  the 
study.     In  Chapter  II,  the  related  literature  is  reviewed 
in  order  to  provide  the  reader  with  a  broader  perspective  of 
the  subject  area.     In  Chapter  III,  the  methodology  and  its 
components are described. In Chapter IV, the results of the
study are presented. Finally, in Chapter V, the
conclusions,  implications,  and  recommendations  that  flow 
from  the  study  are  offered. 


CHAPTER  II 
REVIEW  OF  RELATED  LITERATURE 

Computer  technology — is  it  a  viable  tool  to  be  used  in 
the  counseling  and  education  fields  and,  specifically,  in 
the  area  of  assessment?    Essentially  that  is  the  problem 
addressed  by  this  research.     In  order  to  have  sufficient 
information  to  investigate  this  problem  and  to  formulate 
conclusions,  it  was  necessary  to  peruse  the  literature  and 
to  analyze  statistics  made  available  from  previous  research 
related  to  specific  applications  of  computer  technology  in 
these  fields.     Related  literature  can  be  classified  under 
one of the following four major areas: a) computer-assisted
applications in counseling and education; b) pros and cons of
computer-assisted assessment systems; c) impact of computer
technology on future assessment; and d) research data on
computer-assisted assessment systems.

Each of these four areas is individually discussed in
the  following  sections  of  this  chapter.     The  intent  is  to 
provide  the  reader  with  a  broad  perspective  of  past  and 
ongoing  research  in  this  field  as  well  as  a  detailed  picture 
of  the  current  research  problem.     The  review  of  the  related 
literature  has  been  organized  so  as  to  begin  with  the  most 
general topic and conclude with the most specific. The
final section contains an outline of those areas of the
specific research problem that require additional
investigation.

Computer-Assisted  Applications  in  Counseling  and  Education 
There  are  several  different  categories  of  computer 
applications  in  counseling  and  education  reported  in  the 
literature.     One  of  the  first,  and  still  the  most  widely 
accepted,  involves  using  computers  as  an  aid  in  data 
processing.     Data  collection,  clerical  tasks,  repetitive 
functions,  and  statistical  computations  are  examples  of  the 
functions  that  fall  into  the  data  processing  category. 
Professionals  almost  unanimously  agree  that  computerized 
record  keeping  is  not  only  efficient  and  cost-effective,  but 
also  frees  the  counselor  to  spend  more  time  counseling 
(Burkhead & Sampson, 1985). However, according to Space,
Murphy, and Shelton (1980), records maintained on computer
systems  have  a  much  greater  risk  of  illegal  access  than 
conventional  records  primarily  because  of  remote  terminal 
access. Because large amounts of data can be electronically
stored for long periods of time and at minimal cost, many
professionals store much, if not all, of their client data in
computer memory. The inability to guarantee the safeguarding of
confidential records is considered a major limitation of
these systems (Sampson, 1983a).

An application that has also received wide-scale usage
is  computer-assisted  instruction   (CAI) .     CAI  systems  are 
used  to  provide  instruction  for  a  myriad  of  educational 
purposes  as  well  as  to  assist  therapeutic  applications,  such 
as training in assertiveness, parenting skills, social
skills, appropriate handling of anger, and stress management
(Sampson, 1983a). Many of the more sophisticated CAI
programs  have  some  limited  ability  to  assess  the  user's 
learning  style  and  then  adapt  the  presentation  of 
instruction  accordingly.     Burger   (1985)   contended  that  when 
teaching  and  learning  styles  are  matched,  learning 
efficiency  increases  significantly.     In  addition,  many  have 
reported  that  CAI  systems  have  a  positive  effect  on  learner 
attitude and self-esteem (Cambre, 1984; Dalton, 1986; Ebner,
Danaher, Mahoney, Lippert, & Balson, 1984).

Employment  of  interactive  video  discs   (IVD)  enhances 
CAI  interactive  capabilities.     Interactive  video,  which  can 
be  stored  on  tape  or  disc,  includes  the  features  of 
automatic  recall  and  stop,  dual  audio  tracks,  speed  control, 
and  freeze-frame  capabilities.     While  videotape  has  the 
advantage  of  being  less  expensive,  it  is  slow  in  retrieving 
information. A video disc, on the other hand, provides
access within 2 seconds to any one of 10 billion bits of
information on one side of a disc. Fifty-four thousand
video still frames or 30 minutes of motion picture can be
recorded on one side of a disc. IVD provides for graphic or
print  overlays  on  video  and  immediate,  appropriate  feedback 
and  reinforcement.     Input  can  be  entered  through  a  keyboard, 
mouse, light pen, bar code, or touch screen (Baur, Miller, &
Henry, 1984). Its primary strength is its interactivity.
Not only does the user interact with the medium, but
the medium also promotes interaction among users (Deshler & Gay,
1986). Williams and Harold (1985) found that students who
received  classroom  instruction  augmented  with  IVD  performed 
significantly  better  on  tests  than  those  who  received  only 
classroom  instruction. 

According  to  Sampson   (1983b) ,  one  of  the  most  widely 
used  computer  applications  in  the  counseling  and  education 
fields  is  computer-assisted  career  guidance   (CACG) . 
Typically,  the  objectives  of  career  guidance  systems  are  to 
assist  the  user  in  identifying  appropriate  occupational 
alternatives,  to  provide  assistance  in  predicting  future 
success,  and  to  formulate  long-range  career  plans. 

The  development  and  use  of  computers  for  career 
guidance accelerated in the late 1970s (Tolbert, 1980).
Even the initial systems received positive evaluations from
a number of researchers (Space et al., 1980). According to
Tolbert (1980), CACG systems facilitate the
user's  exploration  of  both  educational  and  occupational 
options.     Ryan,  Drummond,  and  Shannon   (1980)   report  that 
computer-assisted career guidance systems have been shown to
improve the overall quality of career guidance. Sampson
(1983a)  concluded  that  computer-assisted  guidance  systems 
are  most  effective  in  clarifying  values,  formulating  career 
goals,  and  encouraging  career  awareness  and  information- 
seeking  activities. 

While  the  majority  of  the  research  seems  to  support  the 
idea  that  computer-assisted  career  guidance  systems 
generally  provide  relevant  information  about  educational  and 
occupational  alternatives,  there  appear  to  be  uncertainties 
that  warrant  further  investigation.     Johnston  et  al.  (1988) 
cautioned  that  although  these  systems  are  now  commonplace, 
they  have  received  little  psychometric  scrutiny;  their 
validity  and  reliability  need  to  be  systematically  examined. 
Sampson   (1983a)   suggested  that  because  considerable 
differences  in  designs  exist,  one  cannot  conclude  without 
additional  research  that  all  systems  are,  in  fact, 
effective.     Johnston  et  al.    (1988)  have  stated  that  these 
systems  have  not  been  accepted  by  all  users  and  additional 
studies  are  required  that  seek  to  identify  individuals, 
populations,  and  settings  for  which  suitability  exists. 

Cairo   (1983)   found  that  there  is  a  correlation 
between  the  amount  of  time  on  the  computer  and  the  magnitude 
of  user  gain  from  these  systems  and  recommended  more 
investigations  in  this  area.     White   (1987)  substantiated 
Cairo's (1983) contention by stating that maximum
effectiveness with a system is achieved only when users are
sufficiently  oriented  to  the  system.     Additional  studies 
appear  to  be  required  for  these  systems  to  be  optimally 
employed. 

Another  computer  application  that  has  recently  emerged 
in  the  counseling  setting  involves  the  area  of  computer- 
assisted  treatment.     One  example  of  this  is  the  growing 
number  of  computer-assisted  applications  that  have  been 
developed  in  the  cognitive  rehabilitation  area  (Honaker  & 
Harrell,  1987).     Life  Science  Associates  has  developed  a 
series  of  computer  programs  specifically  designed  to  assess 
and  treat  clients  with  cognitive  deficits,  such  as  learning 
disabilities,  head  trauma,  and  cerebrovascular  stroke.  Dr. 
William  Lynch  at  the  Palo  Alto  Veterans  Administration 
Medical  Center  has  found  Atari  video  games  to  be  useful  in 
the  retraining  of  those  skills  often  found  impaired  in 
brain-injured  patients.     For  example,  the  game  "Sabotage" 
has  been  shown  to  develop  complex  coordination  and  tracking 
skills  and  "Asteroids"  has  been  demonstrated  to  improve 
visual  memory  skills   (Honaker  &  Harrell,  1987) .  Preliminary 
studies  also  suggest  that  specifically  designed  computer 
games  may  be  an  effective  way  to  teach  impulse  control  as 
well  as  problem-solving  techniques    (Honaker  &  Harrell, 
1987) .     The  success  of  these  initial  systems  has  led  to 
ongoing  expansion  and  development  of  subsequent  software 
with particular focus on detection and treatment (Schmitt
& Growick, 1985).

Another relatively new computer-assisted treatment
application  is  based  on  the  behavioral  method  of  systematic 
desensitization.     The  method  of  systematic  desensitization, 
which  was  developed  by  Wolpe  in  1958  for  the  alleviation  of 
maladaptive anxiety, follows the principles of
counterconditioning and has been shown empirically to be an
extremely  effective  treatment  for  phobic  disorders   (Rimm  & 
Masters,  1979) .     Systematic  desensitization  appears  to  be 
well  adapted  to  computerization  since  it  follows  a  logical 
and  structured  sequence   (Biglan,  Villwock,  &  Wick,  1979) . 
Computer-assisted  instruction  enhanced  with  video  scenes 
could  conceivably  be  used  to  induce  a  state  of  relaxation, 
while  biofeedback  equipment  could  signal  to  the  computer 
that  a  state  of  relaxation  has  been  achieved.     It  is  now 
even  technically  feasible  to  design  a  computer-assisted 
systematic  desensitization  system  that  allows  the  client  to 
establish  his  or  her  own  hierarchy  of  anxiety-provoking 
stimuli  for  a  wide  variety  of  maladaptive  anxieties  (Farr, 
1984). For example, an interactive video system can be used
to  establish  a  hierarchy  of  fear-producing  or 
anxiety-producing  stimuli  by  displaying  scenes  in  the 
sequence  that  was  established  by  the  individual's  responses. 
Large  visual  display  screens  and  sound  systems  could  also  be 
employed  to  enhance  the  realism  and  detail  of  the  scenes  in 
the  hierarchy  and  facilitate  the  systematic  desensitization 
process.

Computer-assisted biofeedback is another treatment-
oriented  computer  application  presently  employed  in  the 
counseling  field.     Biofeedback  is  a  technique  specifically 
designed to train the individual to regulate consciously his
or her bodily functions, such as heartbeat or blood pressure.
These functions, which were earlier presumed by most to be
involuntary  in  nature,  can  be  consciously  regulated  after 
undergoing  training  in  biofeedback  procedures.  Problems, 
such  as  headaches,  hypertension,  ulcers,  asthma,  and 
anxiety,  can  also  be  alleviated  with  computer-assisted 
biofeedback  devices  and  procedures   (Sampson,  1983a) . 

According  to  Sampson   (1983a) ,  computer-assisted 
biofeedback  has  several  advantages.     First,  standardization 
of  procedures  can  be  maintained  because  structured 
biofeedback  protocols  can  be  created.     A  second  advantage 
involves  the  ease  of  operating  the  system  as  well  as  its 
accuracy  and  efficiency.     Since  biofeedback  data  are  both 
automatically  stored  and  retained,  human  error  in  data 
collection  is  eliminated.     Also,  because  the  biofeedback 
system  is  adaptive,  the  computer  can  modify  the  biofeedback 
protocol  to  meet  the  needs  of  the  clients  as  they  change. 
Another  advantage  of  the  computer-assisted  biofeedback 
system  is  its  ability  to  determine  and  analyze  a  subject's 
complex  pattern  of  physiological  responses  from  a  variety  of 
different  sources  simultaneously.     Despite  the  number  of 
potential advantages, Sampson (1983a) claimed that the
variability among instrumentation, feedback protocols, and
methods of collecting data results in a bewildering
combination  of  factors  that  limit  research  and  require 
further  development. 

The  final  computer  application  to  be  discussed  involves 
assessment  and,  according  to  Sampson   (1987) ,  is  best  termed 
computer-assisted  testing  or  computer-aided  testing.  Since 
computer-assisted  testing  is  the  primary  focus  of  this 
paper,  it  will  be  addressed  in  somewhat  greater  detail  than 
the  aforementioned  applications. 

The  area  of  computer-assisted  testing  has  undergone 
rapid  expansion  in  the  recent  past  and  its  applications  are 
currently  receiving  wide-scale  usage  in  the  counseling  and 
education  fields.     Sampson   (1983b)  divided  computer-assisted 
assessment  systems  into  several  sub-systems.     The  first  type 
of  assessment  system,  which  has  been  in  use  for  the  longest 
period  of  time,  involves  scoring,  profiling  and  interpreting 
standardized  paper  and  pencil  tests.     Optical  scanners  and 
character  readers  are  the  devices  typically  employed  to 
process  the  data.     Interest  inventories,  as  well  as  aptitude 
and  achievement  batteries,  are  examples  of  tests  that  are 
currently  scored  and  interpreted  by  this  type  of  system. 

A  second  type  of  assessment  system  outlined  by  Sampson 
(1983b)  involves  the  use  of  computers  to  administer,  score, 
and interpret results. Various personality tests, such as
the MMPI, are examples of tests that can be administered in
this way (White et al., 1985).

The  third  type  of  assessment  system  also  uses  computers 
for  test  administration  and  scoring.     However,  this  type  of 
assessment  system  provides  a  generalized  interpretation  of 
results  via  computer  controlled  audiovisual  equipment. 

The  last  type  of  assessment  system  consists  of 
interactive  assessment  systems   (Sampson,   1983b) . 
Interactive  systems  are  said  to  be  adaptive  in  the  sense 
that they are programmed to adapt to the examinee's
demonstrated ability level. In other words, the difficulty
level of the test items can be adjusted to the examinee's
capabilities.

Interactive  tests  are  developed  using  a  wide  variety  of 
procedural  models.     For  example,  a  simple  model  might 
involve  only  two  stages  of  testing.     The  first  stage  might 
require  all  examinees  to  take  a  "routing"  test,  which  would 
consist  of  a  variety  of  items  at  disparate  levels  of 
difficulty. Depending upon one's performance on the routing
test,  the  individual  would  be  administered  a  subsequent  test 
that  was  at  a  level  commensurate  with  his  or  her 
demonstrated  ability. 
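
The two-stage model just described can be sketched in a few lines. This is only an illustration under stated assumptions: the cut scores, the three second-stage test labels, and the function name are hypothetical and are not taken from any published test.

```python
from bisect import bisect_right

# Hypothetical values for illustration: routing-test raw scores at or above
# each cut move the examinee to the next, more difficult second-stage test.
ROUTING_CUTS = [4, 8]
SECOND_STAGE_TESTS = ["easy", "medium", "hard"]


def route(routing_score: int) -> str:
    """Choose the second-stage test that matches the routing-test score."""
    level = bisect_right(ROUTING_CUTS, routing_score)
    return SECOND_STAGE_TESTS[level]
```

With these assumed cut scores, a routing score of 3 leads to the easy second-stage test, scores of 4 through 7 lead to the medium test, and scores of 8 or more lead to the hard test.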

Another  adaptive  model  might  begin  with  a  test  item  of 
moderate difficulty. If an examinee's response to this
question is correct, he or she is routed upward to the
next item of greater difficulty; if the response is
incorrect, the individual is routed downward to an item of
slightly less difficulty. This format is followed until the
individual has answered a specific sequence or number of
items (Anastasi, 1982).
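
The item-by-item routing rule above can be illustrated with a short sketch. It assumes that difficulty is expressed as an ordered set of levels numbered from 1 (easiest) upward and that testing stops after a fixed number of items; the function and parameter names are hypothetical.

```python
def up_down_path(n_levels, answers_correctly, n_items):
    """Return the sequence of difficulty levels presented to one examinee.

    n_levels          -- number of ordered difficulty levels (1 = easiest)
    answers_correctly -- callable taking a level and returning True or False
    n_items           -- fixed number of items to administer (an assumption)
    """
    level = (n_levels + 1) // 2          # begin at moderate difficulty
    path = []
    for _ in range(n_items):
        path.append(level)
        if answers_correctly(level):
            level = min(level + 1, n_levels)   # route upward after a correct response
        else:
            level = max(level - 1, 1)          # route downward after an incorrect response
    return path
```

For an examinee who can answer items up to level 6 of 9, the sequence of levels oscillates around the boundary between levels 6 and 7, which is where his or her ability lies.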

More  complex  adaptive  test  development  has  employed 
item response theory (IRT). IRT supplies the models that
serve as the basis for adaptive systems and permits tailoring
test items to the examinee's demonstrated ability. Three of the
leading models used in test
development  are  the  3-parameter  logistic  model,  the 
3-parameter  normal  ogive  model,  and  the  1-parameter  logistic 
model.     McKinley  and  Reckase   (1984)   reported  that  the 
1-parameter  logistic  model,  known  as  the  Rasch  model,  has 
been  particularly  successful  in  the  adaptive  test 
development  area.     The  Rasch  method  uses  the  LOGIST  computer 
program  to  establish  the  parameter  estimates  using  response 
data  from  a  large  population  of  examinees.     The  model 
employs  maximum  likelihood  ability  estimation  as  well  as 
maximum  information  item  selection  procedures.     Testing  is 
continued  until  all  items  have  been  presented  that  yield  an 
item  information  value  for  the  most  recent  estimate  of 
ability. The examinee's ability estimate is increased by a
fixed amount for each correct response and decreased by a
fixed amount for each incorrect response (McKinley &
Reckase, 1984).
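
The components named in this paragraph can be made concrete with a small sketch of the 1-parameter logistic model, in which the probability of a correct response is P = 1 / (1 + exp(-(theta - b))) for ability theta and item difficulty b. The code below is an illustration only: it is not the LOGIST program and does not reproduce McKinley and Reckase's procedure. It assumes item difficulties have already been calibrated, and the function names, the step size of 0.5, and the Newton-Raphson estimation routine are all assumptions made for the example.

```python
import math


def p_correct(theta, b):
    """Rasch (1-parameter logistic) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Information a Rasch item provides at ability theta: P * (1 - P)."""
    p = p_correct(theta, b)
    return p * (1.0 - p)


def next_item(theta, difficulties, used):
    """Maximum-information selection among items not yet administered."""
    candidates = [i for i in range(len(difficulties)) if i not in used]
    return max(candidates, key=lambda i: item_information(theta, difficulties[i]))


def fixed_step_update(theta, correct, step=0.5):
    """Raise or lower the ability estimate by a fixed amount (step is assumed)."""
    return theta + step if correct else theta - step


def mle_ability(responses, difficulties, theta=0.0, iterations=25):
    """Maximum likelihood ability estimate by Newton-Raphson iteration.

    responses are 1 (correct) or 0 (incorrect). A finite estimate exists only
    when the pattern contains both correct and incorrect responses.
    """
    if all(responses) or not any(responses):
        raise ValueError("need at least one correct and one incorrect response")
    for _ in range(iterations):
        ps = [p_correct(theta, b) for b in difficulties]
        gradient = sum(u - p for u, p in zip(responses, ps))
        information = sum(p * (1.0 - p) for p in ps)
        theta += gradient / information
    return theta
```

Under the Rasch model an item is most informative when its difficulty equals the current ability estimate, so maximum-information selection amounts to choosing the unused item whose difficulty lies closest to that estimate.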

Interactive  assessment  systems  are  considered  by  many 
to  offer  advantages  over  conventional  fixed  tests  for  a 
variety of reasons (Hunt & Pellegrino, 1984). According to
Biemiller (1982), interactive assessment systems are
extremely  efficient  since  they  are  designed  to  select 
subsequent  questions  based  on  all  previous  responses  and, 
thereby,  eliminate  those  questions  that  are  either  well 
below  or  substantially  above  the  performance  range  of  the 
examinee.     Interactive  assessment  systems  are  receiving  much 
attention  because  of  the  many  unique  advantages  associated 
with  them. 

In  summary,  computer-assisted  systems  are  currently 
receiving wide-scale usage in the therapeutic setting.
There  are  several  different  categories  of  computer 
applications  in  counseling  and  many  systems  and  sub-systems 
within  each  category.     While  the  rate  of  new  applications  is 
growing  rapidly,  the  amount  of  research  regarding  their  use 
has  not  kept  pace.     In  order  to  determine  adequately  which 
systems  are  effective  and  which  are  not  and  to  establish 
which  applications  hold  the  most  promise  and  for  whom,  more 
research  is  required   (Cairo,  1983;  Honaker  &  Harrell,  1987; 
Lawrence,  1986;  Sampson  &  Burkhard,  1985). 

Pros  and  Cons  of  Computer-Assisted  Assessment  Systems 
Because  computer-assisted  assessment  systems  are  a 
relatively  new  addition  to  the  testing  arena,  a  wealth  of 
substantive  research  data  regarding  their  usefulness  is 
not  available.     Much  controversy  regarding  their 
effectiveness and suitability still exists. While many
experts have extolled the numerous advantages of employing
computer-assisted  assessment  systems,  others  have  voiced 
misgivings  regarding  their  use.     The  more  commonly  reported 
advantages  and  disadvantages  of  computer-assisted  assessment 
systems  will  be  delineated  in  this  section  of  the  paper. 

There  are  several  advantages  to  using  computers  in 
testing  that  have  caused  many  practitioners  to  gravitate  to 
computer-assisted  assessment  systems.     One  of  the  most 
frequently  reported  reasons  is  the  cost  advantage. 
Microcomputers,  used  as  automated  testing  stations,  have 
been  cited  as  being  significantly  more  cost-effective  than 
traditional  testing  methods   (Hunt  &  Pellegrino,  1984; 
Sampson,   1983b;  Ward,   1984) . 

Another  advantage  that  reportedly  can  be  realized  from 
computer-assisted  testing  is  increased  accuracy.  According 
to  Byers   (1981),  errors  caused  by  the  examinee's  answer 
sheet  and  test  booklet  being  out  of  synchronization  are 
eliminated  since  computer-assisted  test  items  are  presented 
one  at  a  time.     Also,  with  particular  classes  of  tests  or  in 
particular  test  situations,  psychological  factors  can  bear 
heavily  on  the  results,  and  because  many  subjects  claim  to 
respond  more  honestly  to  a  computer  terminal  than  to  an 
examiner (Moreland, 1985), accuracy may be
significantly improved through the use of computer-assisted
systems.

Sampson (1983b) reported that error rates may also be
reduced  with  computer-assisted  assessment  systems  by 
exploitation  of  the  potential  for  standardizing  both  test 
administration  and  scoring  procedures.     According  to  studies 
by Cohen and by Marling (cited in Anastasi, 1982), there is
considerable evidence that the results obtained from
individually administered tests may vary systematically as a
function  of  the  examiner.     These  differences  may  be  related 
to  personal  characteristics  of  the  examiner,  such  as  sex, 
race,  appearance,  and  personality,  as  well  as  the  type  of 
examiner-examinee  relationship  established.     Exner   (cited  in 
Anastasi,  1982)   found  that  significant  differences  in 
performance  on  intelligence  tests  were  obtained  as  a  result 
of  a  "warm"  versus  a  "cold"  interpersonal  relationship 
between examiner and examinee. Any of these variables may
influence the testing results to some degree. Because of
these  examiner-related  variables,  some  assessment  experts 
have  contended  that  computer-assisted  testing  systems  offer 
a  more  standardized  testing  experience  than  conventional 
methods   (Sampson,  1983b;  Space,   1981) . 

The  convenience  associated  with  computer-assisted 
methods  is  another  reported  advantage.     Ward  (1984)  claimed 
that  problems  typically  associated  with  scheduling  and 
administration  of  tests  have  been  significantly  lessened. 
Furthermore,  because  computer-assisted  assessment  systems 
have been found to facilitate the test administration, data
collection, and data analysis processes, opportunities for
increased testing present themselves (Calvert & Waterfall,
1982). This feature may be of particular importance since
the  most  common  finding  from  studies  involving  the  use  of 
microcomputers  in  the  classroom  is  that  the  amount  of 
learning  dramatically  increases  when  students  are  tested 
frequently   (Okey  &  McGarity,   1982) . 

Another  aspect  of  computer-assisted  testing  is  that 
examinees  generally  show  a  preference  for  computer-assisted 
testing. Lukin et al. (1985) reported that 85% of the college
students surveyed preferred computer-assisted
testing over conventional methods of testing. White et al.
(1985)    found  that  the  majority  of  undergraduates  preferred 
computer-assisted  formats  over  conventional  paper  and  pencil 
formats  when  taking  the  MMPI.     In  addition  to  being  more 
satisfying  for  many  users,  computer-assisted  testing  systems 
are  reported  to  be  highly  motivating  for  some  individuals 
(Hunt  &  Pellegrino,  1984)  .     Computer-controlled  tests  can  be 
presented  in  a  "game-like"  format  and  have  been  found  to  be 
self-motivating  (Loftus  &  Loftus,   1983)  .     Since  motivation 
is  believed  to  be  instrumental  in  test  performance,  perhaps, 
for  at  least  some  individuals,  computer-assisted  systems 
measure true ability more accurately than conventional
systems.

Still other advantages include the rapid scoring and
reporting  features.     The  opportunity  for  immediate  feedback 
and  knowledge  of  results  facilitates  the  overall  decision 
making  process   (Sampson,  1983b;  Wood,   1984) .  However,  the 
opportunity  for  adaptive  testing  is  considered  to  be  the 
motivating  force  behind  the  accelerated  usage  of 
computer-assisted  testing   (Weiss,  1983).     Adaptive  (or 
interactive)   assessment  systems  can  be  programmed  to  track 
an  individual's  progress  during  the  testing  process  and 
present  items  at  a  level  of  difficulty  commensurate  with  the 
subject's  demonstrated  ability.     In  addition,  systems  have 
been  developed  to  assess  problematic  areas,  provide  a 
detailed  diagnosis  of  the  topics  shown  to  present 
difficulty,  and  provide  tailor-made  remedial  instruction  for 
problem  areas   (Okey  &  McGarity,   1982)  . 

Hensen  et  al.    (1977)    found  some  of  the  very  early 
adaptive  testing  systems  to  be  as  valid  as  conventional 
systems  and  significantly  more  efficient  in  terms  of  time. 
Kingsbury  and  Weiss   (1981)   compared  conventional  mastery 
tests  with  adaptive  mastery  tests  and  found  adaptive  testing 
not  only  reduced  test  time  requirements  generally,  but  also 
resulted  in  consistently  higher  proportions  of  correct 
classifications  concerning  mastery  status  than  the 
conventional  testing  procedure  when  classroom  performance 
was used as a criterion measure.

Sanders (1985) found that the MMPI could also be
significantly  shortened  without  losses  in  reliability  when 
administered  by  an  interactive  computer  system  that 
individualized  the  number  of  items  needed  to  produce  a  full 
scale  MMPI  profile.     Some  of  the  earliest  studies 
demonstrated  that  when  interactive  systems  were  employed,  up 
to  50%  reduction  in  test  length  relative  to  conventional 
versions  could  be  realized  without  any  loss  of  precision 
(Waters,  1977). 

Hunt  and  Pellegrino   (1984)    found  that  some  adaptive 
systems  possess  the  capacity  both  to  identify  a  user's 
problem-solving  style  and  then  adapt  to  it.     By  virtue  of 
the  interactive  process  that  exists  between  the  computer  and 
user,  an  increasingly  clearer  picture  of  the  user's 
problem-solving  style  can  be  obtained.     Once  a  style  has 
been  identified,  items  can  then  be  presented  in  a  format 
that  not  only  coincides  with  the  user's  style  but 
facilitates  the  overall  assessment  process. 

Some  researchers  claim  that  interactive  systems  may 
actually improve one's problem-solving strategies. For
example, Linn (1985) claimed the capacity for exact and
immediate interactive feedback provides, in certain
situations, the potential for the user to test solutions,
modify them, and test them again. This independent-learning
feature has been shown to enhance the development of
higher-order cognitive skills and problem-solving abilities
(Mandinach & Linn, 1986; Perkins, 1985; White, 1987).

The  opportunity  for  additional  control  over  learning 
and  feedback  of  results  has  contributed  to  the  increasing 
popularity  of  interactive  systems   (White,   1987)  .  Weiss 
(1983)   found  that  more  than  90%  of  the  students  reacted 
favorably  to  the  provision  of  immediate  knowledge  of 
results.     Weiss   (1983)   also  claimed  that  adaptive  testing 
with  or  without  immediate  knowledge  of  results  generally 
created  a  psychological  environment  for  testing  that  is  more 
equivalently  motivating  for  all  users,  regardless  of  ability 
level.     This  results  in  a  greater  standardization  of  the 
test-taking  environment  than  that  commonly  found  under 
conventional  testing  conditions. 

Interactive  systems  are  also  currently  being  employed 
in  lieu  of  the  traditional  psychological  interview  to  obtain 
and  assess  historical  and  problem-related  information  in 
order  to  establish  therapeutic  goals   (Sampson,  1983a) . 
Although  the  traditional  interview  is  still  widely  used 
today,  several  negative  factors  regarding  its  use  have  been 
reported.     First,  the  interview  process  consumes  a  large 
amount  of  professional  staff  time  and,  as  such,  is  costly. 
Second, according to Sampson (1983a), consistency between staff
members conducting traditional interviews is low.
Interactive interviewing systems, on the other hand, which
have been developed for identifying clients who exhibit a
high risk of suicide, have been reported to be more reliable
than clinicians in predicting suicide attempts (Angle,
Johnson, Grebenkemper, & Ellinwood, 1979). In addition, the
computer-assisted interviewing systems have been preferred
over the conventional method by the majority of the clients
(Angle et al., 1978).

While  many  testing  experts  have  enumerated  the  various 
advantages  of  computer-assisted  testing,  some  point  to 
certain  disadvantages.     One  major  disadvantage  stems  from 
the  rapidly  expanding  amount  and  diversity  of  available 
software  for  these  applications.       Such  rapid  growth 
characteristically  brings  poor  quality  products  into  the 
marketplace  along  with  higher  quality  products.  This 
situation  has  been  exacerbated  by  those  individuals  who  are 
more  interested  in  profit  than  in  the  quality  of  software 
(Farrell,   1983)  .     According  to  Burkhead  and  Sampson   (1985)  , 
it  is  difficult  for  the  average  practitioner  to  identify 
software  that  is  suitable  to  his  or  her  purposes  from  among 
the  hundreds  of  programs  available,  some  of  which  are 
suitable  and  many  of  which  are  not.     Systems  that  fail  to 
incorporate  adequate  human  factor  provisions  can  be  both 
confusing  and  frustrating  to  the  user  (Johnson  &  Johnson, 
1981)   and  can  result  in  faulty  testing  services   (Burkhead  & 
Sampson, 1985).

Many researchers are concerned that the widespread
availability  of  computer-assisted  tests  may  encourage 
professionals  to  become  overly  reliant  on  computer-generated 
results   (Matarazzo,  1983)  .     These  concerns  are  reinforced  by 
the  public's  perception  that  computer-generated  results  by 
definition  are  accurate  and  valid.     According  to  Cairo 
(1983)  ,  computer-assisted  testing  is  simply  not  appropriate 
for  all  situations  and  all  individuals.     For  example,  an 
individual  under  stress  may  not  be  capable  of  concentrating 
sufficiently  to  interact  with  a  computer  and  may  require  the 
support  and  encouragement  that  can  only  be  provided  by  a 
human  examiner.     Variables  such  as  the  emotional  state  and 
physical  limitations  of  the  examinee  must  be  considered 
before  computer-assisted  testing  is  employed.  Adequate 
screening  is  essential  to  ensure  valid  results  and  the 
welfare  of  the  examinee   (Sampson,  1983a) . 

Security  risks  are  considered  to  be  another  major 
disadvantage  of  computer-assisted  testing.     According  to 
Zachary  and  Pope   (1984)  ,  maintaining  the  confidentiality  of 
electronically  stored  test  data  can  be  difficult. 
Unauthorized  access  to  test  scores  has  increased 
dramatically  with  the  advent  of  computer  networking  of  large 
numbers  of  sophisticated  remote  terminal  users   (Burkhead  & 
Sampson, 1985; Space et al., 1980).

Another risk involves the potentially inappropriate
use of the computer-assisted test. According to Zachary and
Pope (1984), the misuse of tests by both unauthorized users
(individuals who lack the credentials for conducting
psychological tests) and unsophisticated users (individuals
with credentials but no training for conducting a particular
type of test) is far greater with computer-assisted systems
than  with  conventional  tests.     Health  professionals  who  are 
interested  in  obtaining  psychological  evaluations  but  lack 
the  requisite  training  may  be  more  likely  to  make  use  of  the 
computer-assisted  version  than  the  conventional  version  of  a 
test   (Burkhead  &  Sampson,   1985) . 

Still  other  researchers  believe  that  the  greatest 
disadvantage  of  computer-assisted  tests  involves  the  changes 
that can accrue when the tests undergo the computer
conversion  process.     Matarazzo   (1983)   claimed  that 
computerization  of  tests  may  alter  the  nature  of  tests. 
Hunt  and  Pellegrino   (1984)   cautioned  that  computer-assisted 
tests  could  change  the  psychological  talent  being  evaluated. 
In  addition,  they  could  selectively  change  the  construct  for 
all  people  being  evaluated  or  for  some  examinees  and  not 
others.

According to Duthie (1984), differences in results
obtained between the conventionally administered test and
its computer-assisted counterpart can be attributed to the
examinee's general attitude toward computers. Specifically,
the  "mystique"  sometimes  associated  with  computers  may 
influence  the  response  set  of  the  examinee.     Duthie  (1984) 
also  reported  that  the  opportunity  for  reflection  time  on 
conventionally  administered  tests  is  lost  during  the 
computer-administered  version  and  that  fatigue  is  more  of  a 
factor  during  the  paper  and  pencil  testing  than 
computer-assisted  testing. 

Differences  in  test  formats  and  administration 
procedures  necessitated  by  the  conversion  process  from 
conventionally  administered  to  computer-assisted  versions  of 
the  same  tests  may  distort  test  results  and  evaluations. 
The  potential  for  distortions  between  versions  appears  to  be 
greater  when  individually  administered  tests  are  converted 
to  computer  for  administration  than  when  paper  and  pencil 
tests  are  converted.     This  can  be  attributed  to  three  basic 
differences  in  the  character  of  the  tests.     First,  the 
absence  of  rapport  with  a  human  examiner  may  alter  the 
testing experience (Burkhead & Sampson, 1985). Second,
more information is likely to be presented visually on the
computer-converted version than on an orally
administered version. This difference may alter the testing
experience because information in the iconic memory store is
processed differently from information in the echoic memory
store (Bower & Hilgard, 1981). Finally, because more
information is likely to be presented visually, more reading
is generally required with the computer-assisted systems
than with individually administered tests.

Computerized adaptive testing introduces still another
possible source of variance. Adaptive tests are designed to
estimate the individual's ability during the
testing process. This estimation process permits the
administration of only those test items that are at a
difficulty level commensurate with the examinee's
demonstrated performance.
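
The estimate-then-select cycle described above can be illustrated with a minimal sketch. It is not the procedure used by the computerized adaptive DAT or by any model cited here; it assumes a simple "staircase" rule in which the ability estimate rises after a correct answer and falls after an incorrect one, with a step that halves on each item. The item pool, the `answer` callback, and all function names are hypothetical.

```python
def next_item(theta, difficulties, used):
    """Return the index of the unused item whose difficulty is closest to theta."""
    candidates = [i for i in range(len(difficulties)) if i not in used]
    return min(candidates, key=lambda i: abs(difficulties[i] - theta))


def run_adaptive_test(difficulties, answer, max_items=6, start=0.0, step=1.0):
    """Administer up to max_items items, updating the ability estimate each time.

    answer(i) is a hypothetical callback that returns True when item i is
    answered correctly.
    """
    theta = start
    used = []
    for _ in range(min(max_items, len(difficulties))):
        i = next_item(theta, difficulties, used)
        used.append(i)
        # Staircase rule (an assumption): move up after a correct answer,
        # down after an incorrect one.
        theta += step if answer(i) else -step
        step /= 2.0  # smaller adjustments as evidence accumulates
    return theta, used


# Hypothetical pool of 13 items with difficulties from -3.0 to 3.0.
DIFFICULTIES = [x / 2.0 for x in range(-6, 7)]


def simulated_examinee(true_ability):
    """Hypothetical examinee who answers correctly whenever the item is not too hard."""
    return lambda i: DIFFICULTIES[i] <= true_ability
```

Calling `run_adaptive_test(DIFFICULTIES, simulated_examinee(1.2))` administers six of the thirteen items, each chosen near the running estimate. This is the sense in which an adaptive test presents only items commensurate with demonstrated performance.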

Adaptive  tests  are  constructed  using  a  variety  of 
models  to  both  calibrate  test  items  and  estimate  ability. 
Weiss   (1983)   reported  that  various  methods  of  constructing 
adaptive  tests  can  result  in  differences  in  item  pool 
configurations,  test  item  selection  procedures,  test  length, 
and  ability  estimates.     The  various  models  used  to  construct 
adaptive  tests  may  cause  differences  in  the  psychometric 
characteristics  of  the  test  (Weiss,  1983)  .     These  data  lead 
one  to  conclude  that  if  tests  that  are  constructed  using 
different  adaptive  models  generate  different  results  from 
one  another,  then  adaptive  tests  may  likely  generate 
different  results  from  their  conventional  counterparts. 

In the aggregate, these findings allow one to suspect
that conversions of tests to computer administration may
result in differences in the testing process that could
significantly affect examinees' performances and
examiners' evaluations. An understanding of the
differences  incurred  during  the  computer  conversion  process 
as  well  as  an  appreciation  of  the  advantages,  disadvantages, 
and  suitability  of  computer-assisted  assessment  systems  are 
essential  if  appropriate  utilization  of  this  technology  is 
to  be  realized. 

Impact  of  Computer  Technology 
On  Future  Assessment 

Although  there  is  some  controversy  surrounding  the 
incorporation  of  computer  technology  into  the  testing 
fields,  most  researchers  are  confident  that  computer 
technology  will  not  only  ultimately  enhance,  but  also  expand 
upon  current  means  of  aptitude  testing.     Sternberg   (1986) , 
for  example,  predicted  that  computer  technology  will  provide 
the  requisite  means  to  measure  an  increased  spectrum  of 
traits  and  abilities.     Sternberg  further  expressed  the 
belief  that  the  capability  of  assessing  a  vast  array  of 
constructs,  currently  unmeasurable  by  conventional  means, 
will,  in  turn,   lead  to  a  need  to  expand  upon  what  is 
presently  conceptualized  as  intelligence. 

A smaller number of researchers are skeptical about alterations to and
expansions of current methods of aptitude testing.
Traditionalists in the testing field have long believed that
a test is a measure of specialized abilities and
intelligence to the extent that it correlates with the
results  of  tests  that  have  already  received  acceptance. 
There  are  compelling  arguments  for  this  criterion  since 
educators  and  psychologists  have  developed  sound  concepts 
that  are  supported  by  a  wealth  of  interpretive  data  and 
based  on  years  of  collected  statistics.     Therefore,  this 
knowledge  and  interpretive  data  are  considered  especially 
important when tests are used to make critical
decisions  that  will  affect  the  course  of  people's  lives. 

Others,  such  as  Boring,  believe  that  "Intelligence  is 
what  the  test  measures"    (Boring,  1923,  p.  35).     Boring  also 
avowed  that  new  methods  of  evaluation  might  be  theoretically 
justified  and  should  not  be  discounted  simply  because  they 
differ  substantially  from  any  test  currently  in  use.  In 
keeping with the essence of this research, even if obvious
differences exist between conventional and
computer-assisted methods and a high degree of correlation
is not achieved, one could ask whether computer-assisted
versions are not only quicker, easier, and less costly, but
perhaps even better in terms of measurement accuracy. As a
case in point, a computer-assisted version of the Minnesota
Paper Form Board test was shown
to expand significantly the evaluation of spatial ability.
The computer-assisted version permitted analyses of errors
and reaction times, which made it possible to extend the
assessment process and to obtain a clear picture of the
difficulty that examinees had with each component of the
spatial task (Hunt & Pellegrino, 1984).

Currently, the views of a number of assessment experts are in
consonance with Boring's (1923) belief that new
opportunities  in  testing  methodology  warrant  exploration  and 
consideration,  even  though  correlations  with  traditional 
methods  are  non-existent.     Many  counselors  and  educators 
also  contend  that,  if  current  technology  were  applied  to  the 
assessment  field  judiciously,  the  quality  of  cognitive 
assessment  could  be  significantly  improved.     Hunt  (1982) 
claimed  that  there  are  many  potential  advantages  to  be 
realized  by  applying  computer  technology  to  assessment 
practices. More broadly, Hunt has stated, "Any application
of  science  should  be  reviewed  periodically  to  ensure  that  it 
is  based  on  the  best  science  and  use  of  technology"  (Hunt, 
1982,  p.   235) . 

Several researchers, who concur with Hunt's (1982)
philosophy and advocate taking maximum advantage of the state of
the art, have reviewed the latest technology to determine
how  recent  developments  in  the  assessment  field  can  be 
exploited.     As  a  result  of  these  reviews,  a  number  of 
enhancements  have  been  realized.     For  example,  utilization 
of  microcomputers  augmented  with  special  input  devices  such 
as  joysticks,   light  pens,  touch  sensitive  screens,  and 
voice-operated devices (Burkhead & Sampson, 1985), has made
it possible for individuals with visual, auditory, or
physical limitations to complete various tests with minimal
staff assistance (Sampson, 1983b). Other findings allow for
the  conclusion  that  some  of  the  newer  computer-assisted 
systems  have  the  potential  not  only  to  facilitate  the 
assessment  process  but  also  to  expand  the  scope  of  cognitive 
abilities  that  can  be  assessed. 

It  has  become  abundantly  clear  that  computer  technology 
will  alter  the  course  of  assessment.     One  area  of  aptitude 
testing  that  has  already  received  considerable  attention  is 
that  of  spatial  ability  evaluations.     Studies  conducted  by 
Pellegrino  and  Kail   (cited  in  Hunt  &  Pellegrino,   1984)  have 
demonstrated  that  computer-assisted  testing  can  expand  upon 
the  information  currently  obtainable  on  conventional  spatial 
ability  assessment.     Most  of  the  currently  available 
spatial-visual  tests  measure  the  encoding  and  visualization 
processes  together.     Pellegrino  and  Kail  demonstrated  that 
it  is  possible  to  obtain  independent  measures  of  both  the 
visualization  and  encoding  factors.     This  is  accomplished  by 
using  the  computer  adaptive  testing  format  to  present 
successive  variations  of  a  given  problem  in  a  way  that 
increases  the  complexity  of  the  problem  along  either  the 
encoding  or  visualization  dimension. 

According  to  Hunt  and  Pellegrino   (1984) ,  computer 
technology  also  offers  the  potential  for  expanding  upon  our 
current concepts of spatial ability testing. While
traditional  spatial  ability  tests  involve  measuring  one's 
ability  while  looking  at  a  display,  a  better  indicator  of 
one's  spatial  ability  might  be  obtained  by  measuring  one's 
"geographic  orientation"  -  one's  ability  to  orient  oneself 
to  the  surroundings.     It  is  contended  that  geographic 
orientation  could  be  measured  relatively  easily  by  using 
computer-controlled  video  disc  devices   (Hunt  &  Pellegrino, 
1984) .     Individuals  would  be  presented  visual  inputs  of 
imaginary  walks,  drives,  or  other  scenes  and,  subsequently, 
be  tested  on  the  detail  of  the  surroundings. 

It  is  also  expected  that  computer-assisted  testing  will 
eventually  be  used  to  evaluate  reading  comprehension  skills. 
Prototype  computer-controlled  testing  procedures,  designed 
to  analyze  various  aspects  of  reading  comprehension,  have 
been  developed   (Hunt  &  Pellegrino,   1984) .     These  procedures 
theoretically  could  be  included  as  part  of  a  computer- 
assisted  multiple  aptitude  scale  or  intelligence  scale  and 
yield  valuable  information  on  selective  language 
deficiencies and numerous specific aspects of reading
comprehension.

The  incorporation  of  video  disc  players  into 
microcomputer  systems  has  now  made  it  technologically 
feasible  to  assess  the  comprehension  ability  of  the 
non-reader  by  presenting  action  sequences  in  lieu  of 
narrative passages. This feature has received much
attention  because  it  permits  administering  comprehension 
tests  to  those  with  reading  disabilities  and  has  expanded 
the  scope  of  comprehension  tests  in  general.     Also,  because 
reading  ability  will  no  longer  be  a  prerequisite  for  these 
types  of  computer-assisted  tests  that  measure  comprehension 
as  well  as  other  skills,  the  content  validity  of  these  tests 
will not be compromised.

Another  area  in  which  computer-assisted  testing  is 
expected  to  expand  upon  current  levels  of  testing  is  that  of 
learning  ability.     Learning  ability  is  an  important  aspect 
of  mental  competence   (Sternberg,   1981) ;  however,  it  is 
usually  not  included  in  aptitude  batteries  and  intelligence 
tests.     Without  computer  assistance,  it  is  very  difficult  to 
test one's ability to learn complex material. It appears
that systematic testing of learning ability could be
incorporated  easily  into  computer-assisted  instructional 
programs   (Hunt  &  Pellegrino,  1984) .     The  assessment  would  be 
made  by  the  computer  based  on  the  speed  of  learning  new 
material  as  it  is  presented  by  the  computer.     If  this  were 
realized,  it  would  present  opportunities  for  testing  one's 
general  ability  to  learn  in  specific  situations  or  under 
various  conditions. 

Computer-assisted  testing  will  also  permit  assessment 
of  the  examinee's  reaction  times.     According  to  Boretz, 
Brown, and Kahn (1984), there appears to be a statistically
significant relationship between the speed at which an
individual  is  able  to  process  stimuli  and  that  individual's 
level  of  intellectual  functioning.     If  reaction  time  proves 
to  be  a  reliable  indicator  of  a  specialized  ability  area, 
then  computer-assisted  systems  could  be  employed  to  present 
stimuli,  record  response  time  at  millisecond  accuracy,  and 
measure  that  aspect  of  cognitive  functioning. 
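
The present-stimulus, record-latency step mentioned above might look like the following minimal sketch. It assumes a high-resolution monotonic clock (Python's `time.perf_counter`) and two hypothetical callbacks, `present` and `get_response`, that stand in for whatever display and input hardware a real system would use. Actual millisecond accuracy depends on that hardware, which this sketch does not model.

```python
import time


def timed_response(present, get_response, clock=time.perf_counter):
    """Show a stimulus, wait for a response, and return (response, latency in ms).

    present      -- hypothetical callback that displays the stimulus
    get_response -- hypothetical callback that blocks until the examinee responds
    clock        -- function returning the current time in seconds
    """
    present()
    start = clock()                       # timing begins once the stimulus is shown
    response = get_response()
    elapsed_ms = (clock() - start) * 1000.0
    return response, elapsed_ms
```

Passing the clock as a parameter is a design choice made here so that the timing logic can be checked with a simulated clock instead of real elapsed time.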

Weiss   (1983)   claimed  that  computer-assisted  systems 
will  facilitate  development  of  a  wide  range  of  new  kinds  of 
tests  that  can  supplement  the  standard  dimensionality-based 
tests  currently  in  use.     For  example,  Weiss   (1983)  contended 
that  interactive  administration  of  problem-solving  types  of 
tests  can  yield  abundant  data  on  the  examinee's 
problem-solving  styles  and  response  latencies.  These 
attributes  cannot  be  readily  assessed  using  conventional 
testing  techniques.     Weiss   (1983)   also  foresaw  that 
computer-assisted  assessments  would  expand  current  ways  of 
measuring  perceptual  and  memory  abilities,  as  well  as  a 
number  of  other  aptitudes. 

Another  projected  computer-assisted  assessment 
application  that  appears  to  offer  much  promise  involves 
cross-cultural  testing.     The  testing  of  individuals 
from  highly  dissimilar  cultural  backgrounds  has  received 
increasing  attention  since  mid-century   (Anastasi,  1982). 
Because current tests, which measure standard (American)
English, are undeniably biased toward measuring cultural
exposure, they cannot be considered "fair". If tests were
administered  by  computers,  it  would  be  possible  to  probe 
knowledge  of  specialized  vocabularies  by  formulating  test 
items  using  words  that  are  commensurate  with  an  individual's 
cultural  background  (Hunt  &  Pellegrino,  1984)  . 

Based  on  these  findings,  there  appear  to  be  many 
potential  opportunities  that  could,  one  day  in  the  not  too 
distant  future,  be  realized  by  the  application  of  computer 
technology  to  assessment.     Computer-assisted  testing  seems  to 
afford  a  feasible  approach  for  testing  aspects  of  cognitive 
functioning that are currently unmeasurable, for expanding
the scope of testing to handicapped individuals, and for
removing some of the cultural bias currently embedded in
conventional  tests  and  test  methods.     A  thorough  study  must 
be  conducted  to  ascertain  the  feasibility  of  achieving  these 
ends  through  computer-assisted  testing.     If  feasibility  is 
substantiated,  arguments  must  be  put  forth  for  further 
development  and  application  of  these  test  methods. 

Research  Data  on  Computer-Assisted  Assessment  Systems 

Despite  the  numerous  reported  advantages  of  employing 
computer-assisted  assessment  systems  and  the  promise  that 
many  of  the  proposed  applications  hold,  there  has  been 
little  substantive  research  conducted  to  justify  their  use. 
While computer technology is being applied to a number of
assessment areas, the validity of various applications rests
on limited research (Burkhead & Sampson, 1985).

One  of  the  primary  ways  in  which  this  technology  has 
been  applied  is  in  the  direct  conversion  of  conventional 
tests  to  computer  administration.     Although  it  is  estimated 
that  a  large  number  of  tests  have  undergone  conversion,  it 
is  speculated  that  only  a  small  fraction  have  been  checked 
for  equivalency.     According  to  APA's  Guidelines  for 
Computer-Based  Tests  and  Interpretations   (1987) ,  the 
reliability  and  validity  of  computerized  versions  of  tests 
cannot  be  guaranteed  unless  appropriate  and  rigorous 
equivalency  studies  have  been  conducted. 

The  few  investigations  that  have  been  made  have 
generated mixed results. For example, two
neuropsychological assessment tools, the Halstead Category
test  and  the  Finger  Tapping  test,  were  converted  from  the 
conventional  methodology  to  computer-administration  and 
checked  for  equivalency.     According  to  Honaker  and  Harrell 
(1987) ,  initial  research  indicated  that  the 
computer-converted  version  of  the  Finger  Tapping  test  was 
not  equivalent  to  the  conventional  form.     On  the  other  hand, 
the converted Halstead Category test, which employed visual
reinforcement rather than the auditory reinforcement of the
conventional version, was still found to generate results equivalent to
the original version (Wood & Strider, 1980). Kingsbury and
Weiss   (1981)   compared  computer-assisted  adaptive  and 
conventional  ability  test  methods  in  terms  of  reliability 
and  concurrent  validity.     It  was  determined  that  while 
ability  estimates  were  comparable,  the  concurrent  validity 
analysis  showed  that  the  conventional  test  produced  ability 
estimates  that  correlated  more  highly  with  the  criterion 
test. 

Equivalency  studies  related  to  paper  and  pencil 
assessment  tools  have  also  been  conducted.     Lukin  et 
al.    (1985)   compared  computer  and  conventional  paper  and 
pencil  methods  of  administration  for  three  personality 
assessment  instruments   (i.e.,  the  State-Trait  Anxiety 
Inventory,  the  Therapeutic  Reactance  Scale,  and  the  Beck 
Depression  Inventory)   and  found  both  methods  of 
administration  equivalent  for  all  three  instruments.  White 
et  al.    (1985)   compared  a  computer-administered  version  with 
the  conventionally  administered  version  of  the  MMPI  and 
found  that  the  two  methods  do  produce  comparable  results. 
Computer-assisted  and  conventional  paper  and  pencil  versions 
of  the  Raven  Matrix  Test  were  also  found  to  be  essentially 
equivalent   (Calvert  &  Waterfall,   1982) . 

McBride (1986) reported, at the Annual APA Convention, the
results of two field studies in which he investigated the
equivalency of the computerized DAT. McBride
concluded that all but one subtest could be
considered psychometrically equivalent to their printed
counterparts. Since this particular computerized adaptive
test is the subject of this research, it will be addressed in
more detail in the subsequent section.

Despite  the  need  to  comply  with  the  guidelines 
established  by  APA  (1987)  ,  there  appears  to  be  a 
disproportionate  number  of  tests  that  have  been  converted  to 
computer  administration  in  comparison  to  the  number  of 
parity  studies  that  have  been  reported.     Honaker  and  Harrell 
(1987)   claimed  that  because  there  is  little  normative  or 
research  data  to  support  their  use,  current  findings  should 
be  regarded  as  tentative.     Additional  research  to  gain  a 
clearer  understanding  of  the  differences  that  may  be 
incurred  as  a  result  of  the  conversion  process  is  essential 
if  appropriate  utilization  of  this  technology  is  to  be 
realized. 

Literature  Directly  Related  to  the  Research  Problem 
The  specific  focus  of  this  research  was  to  ascertain 
the  equivalency  of  the  conventional  and  computerized 
versions  of  the  DAT.     To  date,  the  research  conducted  by 
McBride   (1986)    is  the  only  published  documentation  on  this 
subject. 

McBride (1986) reported in this important initial
research that the degree of correspondence between the
computerized adaptive DAT subtests and their printed
counterparts was established by conducting correlational
studies. Specifically, scores from the computer-converted
Form V were correlated with scores from the conventional
version of the alternate Form W. Correlation analyses were
conducted for each of the subtests for the eighth through
the twelfth grades. The correlation coefficients for ninth
grade students ranged between .79 and .89. Previous
comparisons of the two conventional Forms V and W yielded
correlation coefficients between .66 and .90 for a ninth
grade population. This raises the concern that the scaling
that was performed was based on versions that were
themselves somewhat inconsistent.

According  to  the  guidelines  established  by  APA  (1987) , 
scores  from  the  computerized  version  of  conventional  tests 
may  be  considered  equivalent  to  conventional  scores  only 
when  the  following  two  conditions  have  been  met.     First,  the 
rank  orders  of  scores  of  individuals  tested  in  alternative 
modes  must  clearly  approximate  each  other.     Second,  the 
means,  dispersion,  and  shape  of  the  score  distribution  must 
be  approximately  the  same  or  have  been  made  the  same  by 
rescaling  the  scores  from  the  computerized  mode. 
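The two conditions above can be illustrated with a minimal sketch. It assumes a small set of paired scores that are invented purely for illustration, and it uses plain Python rather than any particular statistics package. The function names (ranks, spearman_rho, ks_statistic) are hypothetical helpers and are not drawn from the APA guidelines or from McBride (1986).

```python
from statistics import mean, stdev

def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Condition 1: rank-order agreement (Pearson r computed on the ranks)."""
    return pearson(ranks(x), ranks(y))

def ks_statistic(x, y):
    """Condition 2 (shape): largest gap between the two empirical CDFs."""
    def ecdf(sample, t):
        return sum(1 for v in sample if v <= t) / len(sample)
    return max(abs(ecdf(x, t) - ecdf(y, t)) for t in sorted(set(x) | set(y)))

# Hypothetical raw scores for six examinees (invented for illustration only).
conventional = [31, 27, 40, 22, 35, 29]
computerized = [30, 28, 41, 21, 33, 32]

rho = spearman_rho(conventional, computerized)            # rank order
mean_gap = mean(computerized) - mean(conventional)        # means
sd_ratio = stdev(computerized) / stdev(conventional)      # dispersion
d_stat = ks_statistic(conventional, computerized)         # shape
```

A rank correlation near 1, a mean difference near 0, a dispersion ratio near 1, and a small maximum gap between the cumulative distributions would all point toward equivalence. Whether any observed value is acceptable remains a matter for formal significance testing.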

In keeping with the APA (1987) guidelines, an
alternative approach would entail the following features:
1) establish two equivalent groups through randomization
techniques and administer the conventional Form W to both
groups to verify group equivalence; and 2) administer the
conventional Form V to one group and the computerized Form V
to the other and analyze results to determine whether the
computerized test yields comparable results to the
conventional Form V. This alternative approach has the
advantage that one can distinguish the effects of converting
from conventional to computer administration from the
effects of converting from one form of the conventional test
to another.

A  second  concern  in  examining  only  the  amalgamation  of 
scores  when  comparing  the  conventional  Form  W  to 
computerized  Form  V  is  that  the  amalgamation  process  could 
mask  some  inherent  bias  against  some  particular  group  or 
groups. For example, the academically gifted examinees
would be expected to adapt more readily to computer
administration and, therefore, would be less affected than
their less gifted counterparts by a new experience and a new
testing environment. Alternatively, it could be that those
who  have  had  considerable  experience  with  computers  would 
hold  a  decided  advantage  in  taking  the  computerized  Form  V 
over  those  with  little  or  no  previous  experience. 

In summary, the alternative approach as discussed above
would allow for the compartmentalization of the variance
into the following distinct parts: 1) variance due to the
computer-conversion process; and 2) variance due to the
difference between forms. The design would also allow for
the incorporation of a blocking variable so that the scores
of the academically talented and the average students may be
examined separately. This approach would also incorporate
procedures to meet the specific requirements of the APA
(1987) guidelines. An examination of rank order scores
would be conducted to establish the degree to which the
computerized mode approximated the conventional mode of test
administration. In addition, the means, dispersions, and
shapes of the score distributions for the two modes would
also be established to investigate the issue of equivalency.

Summary 

Chapter  II  presented  a  review  of  the  literature  related 
to  the  area  of  computer-assisted  testing.     In  the  first 
section,  the  review  began  with  an  overview  of  the  various 
categories  of  computer-assisted  applications  that  are 
currently  employed  in  the  counseling  and  educational  arenas. 
This was followed in the second section by a recapitulation
of the reported advantages and disadvantages of
computer-assisted assessment systems. The third section
contained a summary of some of the latest developments in
the field, and speculation on future developments in
computer-assisted testing and how these developments can be
employed for the enhancement of current assessment
practices. The fourth section included a review of the
findings  on  research  investigations  that  specifically 
examined  the  equivalency  of  computer-converted  tests  with 
their  conventional  counterparts.     The  final  section 
contained  a  perspective  of  those  aspects  of  the  research 
problem  that  require  further  investigation  in  order  to 
answer  key  research  questions. 


CHAPTER  III 
METHODOLOGY 


The  primary  goal  of  this  experimental  research  was  to 
investigate  whether  computer-assisted  methods  of  aptitude 
testing  generate  significantly  different  test  results  from 
conventional  methods  of  aptitude  testing.       Chapter  III 
includes  a  description  of  the  test  instruments,  sample 
population,  research  design,  and  research  hypothesis. 
Additionally,  an  explanation  of  how  the  research  was 
conducted,  and  data  collection  and  analysis  procedures  are 
provided. 

Selected  Aptitude  Battery 
General  Description 

The  assessment  tool  selected  for  this  study  was  the 
Differential  Aptitude  Tests   (DAT) .     The  DAT  is  one  of  the 
most  widely  used  multiple  aptitude  batteries.     It  was  first 
developed  in  1947  for  the  purpose  of  providing  a  well 
standardized  procedure  for  measuring  the  abilities  of 
students  in  grades  8  through  12   (Anastasi,  1982) .  While 
this  battery  was  constructed  primarily  for  use  in  the 
educational  and  vocational  counseling  of  students  in  junior 
and  senior  high  schools,  it  has  also  been  used  by  industry 
and  government  in  the  selection  of  employees  (Bennett, 
Seashore, & Wesman, 1982).

The DAT is classified as a group test, which allows for
the  opportunity  of  assessing  many  individuals 
simultaneously.     A  group  test,  unlike  the  individual  test, 
eliminates  the  need  for  a  one-to-one  relationship  between 
the  examiner  and  the  examinee  by  employing  printed  test 
items  exclusively  and  by  requiring  only  simple  responses 
that  can  be  recorded  on  an  answer  sheet  or  test  booklet 
(e.g.,  multiple  choice  items).     Because  the  examiner's  role 
is  minimized,  the  conditions  under  which  testing  is 
conducted  are  more  uniform  during  group  tests  than  for 
individual  tests.     Group  tests  also  differ  from  individual 
tests  in  the  organization  of  test  items.     While  test  items 
on  individual  tests  are  generally  organized  according  to  age 
levels,  test  items  on  group  tests  are  organized  with  respect 
to  subject  areas. 

The  DAT  measures  and  yields  scores  in  the  following 
eight  subject  areas:     verbal  reasoning,  numerical  ability, 
abstract  reasoning,  clerical  speed  and  accuracy,  mechanical 
reasoning,  space  relations,  spelling,  and  language  usage. 
The  scope  of  subject  areas  encompassed  by  the  DAT  battery 
allows  for  discrimination  among  those  subject  areas  and 
provides  an  opportunity  to  identify  those  particular  skills 
and  abilities  that  can  be  successfully  measured  by 
computer-assisted test methods.

Psychometric Properties

Anastasi   (1982)   claimed  that  the  DAT  is  a  technically 
sound  instrument.     The  1982  norms  are  based  on  a  highly 
representative  sample  of  students  in  grades  8  through  12  in 
the  United  States.     In  order  to  ensure  a  sample  that  would 
reflect  the  entire  student  population,  a  two-stage  procedure 
was  followed  (Bennett  et  al.,   1982).     First,  a 
representative  sample  of  U.S.  school  districts  was 
identified. Second, within each of the selected districts,
groups of students were chosen who were considered
representative of the district's student population at each
of the five grade levels involved in the standardization
testing. The demographic variables of family income and
education as well as ethnic classification were considered
during the sampling process to ensure that the composition
of the standardization sample was in consonance with the
national figures. The resultant normative sample consisted
of students from 64 school districts in 32 states (Bennett
et al., 1982).

Anastasi   (1982)   reported  that  the  DAT  is  a  very 
reliable  instrument.     Split-half  techniques  were  used  to 
estimate  the  reliability  of  the  power  tests  and 
alternate-form  reliability  coefficients  were  calculated  for 
the one speed test. Reliability coefficients were
calculated for all tests at each grade level for males and
females separately and range from .82 to .97 (Bennett et
al., 1982).

In terms of validity, the amount of data on the DAT is
considerable. Anastasi (1982) claimed that there are
several  thousand  correlation  coefficients  concerned  with 
predictive  validity  of  the  DAT  in  terms  of  high  school 
achievement  in  both  academic  and  vocational  programs. 
The  majority  of  these  coefficients  are  high,  even  with 
intervals  as  long  as  three  years  between  test  and  criterion 
data. Perry (1976) claimed that DAT scores obtained as
early as the 8th grade were found to be significantly
related to subsequent scores obtained in post-secondary
vocational training programs. Bennett et al. (1984) reported
that there is a general pattern of substantial correlations
between DAT scores and other measures of ability.
Correlations have been reported by Bennett to be uniformly
high between this battery and other tests that measure the
same skills (e.g., Scholastic Aptitude Test).

In  addition,  Tolbert   (1980)   claimed  that  the  DAT  test 
data  can  be  used  to  enhance  career  development  planning. 
Specifically,  the  DAT's  profile  of  scores  can  be  combined 
with  the  results  on  the  DAT's  Career  Planning  Program  (CPP) 
to  produce  a  computer-generated  analysis  of  occupational 
plans  and  goals.     The  DAT's  CPP  has  been  found  to  generate 
interpretations  equal  in  effectiveness  to  those  provided  by 
counselors (Tolbert, 1980).

Computerized Adaptive Version

Since  its  inception  in  1947,  the  DAT  has  undergone  a 
number  of  revisions  and  in  1986  it  was  adapted  for  computer 
administration.     The  DAT  computerized  adaptive  version  was 
developed  and  published  by  the  Psychological  Corporation  and 
is  the  first  commercially  published  computerized  adaptive 
test  in  the  educational  measurement  field.     This  adaptive 
version  uses  a  microcomputer  to  administer  and  score  the 
responses  and  print  the  results.     The  test  items  are 
displayed  on  the  computer  screen  and  the  answers  are 
inputted  into  the  system  via  the  keyboard.     One  of  the  two 
forms  of  the  printed  DAT,  the  Form  V,  was  selected  for  the 
computerized  adaptive  version. 

The  design  of  the  DAT  adaptive  test  encompasses  several 
very important technical issues. First, like most recent
adaptive tests, the DAT is based on item response theory
(IRT), which permits the administration of tests that are
individually  tailored  to  the  ability  level  of  the  examinee. 
Specifically,  only  those  items  that  are  appropriate  to  the 
examinee's  ability  level  are  administered. 

The Rasch IRT model was used to calibrate all the test
items from the bank using response data from the 1982
standardization of DAT Forms V and W. Owen's Bayesian
sequential updating procedure was selected as the method to
estimate the examinee's ability after answering each
question. In each case, the mean performance of the
subject's grade level in the standardization of the printed
DAT was used as the Bayes' prior mean.

A  modified  "maximum  information"  procedure  was  selected 
as  the  item  selection  method.     This  modified  procedure 
consists  of  preparing  item  selection  look-up  tables  that 
reflect  item  information  at  various  ability  levels.     At  each 
point  in  the  test,  either  the  "best"  item  is  selected,  or  a 
random  choice  is  made  from  among  the  2  to  5  best  items. 

This  item  selection  routine  is  employed  immediately 
following  the  ability  estimate  updating  process.  The 
pattern  of  response  evaluation,  ability  estimation,  and  item 
selection  is  repeated  until  the  test  termination  criterion 
has  been  met.     Termination  occurs  when  the  number  of  items 
administered  is  half  the  length  of  its  conventional 
counterpart. 
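The cycle of response evaluation, ability estimation, and item selection described above can be sketched in a few lines. The sketch is illustrative only and rests on several assumptions that are not part of the published DAT procedure: a grid approximation to the Bayesian posterior mean stands in for Owen's sequential updating procedure, the single most informative item is always chosen (no look-up tables and no random choice among the best few items), and the item difficulties and the simulated examinee are invented. All function names are hypothetical.

```python
import math

def rasch_p(theta, b):
    """Rasch model: probability of a correct response at ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item at theta: P * (1 - P)."""
    p = rasch_p(theta, b)
    return p * (1.0 - p)

GRID = [g / 10.0 for g in range(-40, 41)]  # ability grid from -4.0 to 4.0

def posterior_mean(responses, prior_mean, prior_sd=1.0):
    """Grid approximation to the posterior mean of theta under a normal prior.

    responses is a list of (difficulty, correct) pairs. This is a stand-in
    for Owen's procedure, not an implementation of it.
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
        for b, correct in responses:
            p = rasch_p(theta, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)

def run_adaptive_test(bank, answer, prior_mean, conventional_length):
    """Administer items until half the conventional test length is reached."""
    remaining = list(bank)
    responses = []
    theta = prior_mean  # start at the grade-level prior mean
    while len(responses) < conventional_length // 2 and remaining:
        # Simplified "maximum information" rule: most informative unused item.
        b = max(remaining, key=lambda d: item_information(theta, d))
        remaining.remove(b)
        responses.append((b, answer(b)))
        theta = posterior_mean(responses, prior_mean)  # update the estimate
    return theta, responses

# Hypothetical item difficulties and a simulated examinee who answers
# correctly whenever the item difficulty is below 0.8.
bank = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
theta_hat, administered = run_adaptive_test(
    bank, answer=lambda b: b < 0.8, prior_mean=0.0, conventional_length=10
)
```

Because the information of a Rasch item peaks where item difficulty equals ability, the selection rule amounts to choosing the unused item whose difficulty lies closest to the current ability estimate.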

Bennett  et  al.    (1987)   claimed  that  by  administering 
only those items that are appropriate to the examinee's
ability  level,  the  computerized  adaptive  test  allows  for 
reduced  test  length.     Another  advantage  of  the  computerized 
adaptive  version  is  that  it  permits  the  opportunity  to 
obtain  immediate  results.     Scores  can  either  be  displayed  on 
the  computer  screen  or  presented  in  printed  form  directly 
following  the  testing  process. 


Bennett et al. (1987) claimed that the computerized
adaptive  version  can  be  considered  to  be  a  parallel  form  of 
the  printed  edition  and  that  the  two  forms  can  be  used 
interchangeably.     Furthermore,  the  norms  established  for  the 
printed  versions  can  be  used  to  interpret  the  computerized 
adaptive  test  scores.     To  date,  documentation  regarding  the 
psychometric  properties  of  the  computerized  adaptive  version 
has not been published by the Psychological Corporation.

Selected Instrument to Obtain Personal Information
The  self-report  questionnaire  was  selected  as  the  means 
to  obtain  data  on  those  personal  variables  (e.g., 
attitude  toward  computers)   that  have  been  suggested  in  the 
literature  to  correlate  with  overall  differences  in  scores 
between the two versions. The questionnaire was
administered following the testing effort to obtain
information on the following issues: (a) prior computing
experience; (b) prior keyboard experience; (c) perceptions
and attitudes regarding computers; and (d) preference
regarding working with machines or people.

For many of the questions, the individual was asked to
respond using the following Likert-type scale (Anastasi,
1976) categories: (a) strongly agree; (b) agree; (c)
undecided; (d) disagree; and (e) strongly disagree. For
example, one of the statements that the examinees were asked
to respond to in the above manner was "I enjoy interacting
with computers."

It  was  envisioned  that  the  results  from  the 
conventional  and  computer-assisted  tests,  combined  with  data 
from  the  questionnaire,  would  provide  the  requisite 
information  to  investigate  the  key  issues.  Specifically, 
the  resultant  data  permitted  an  analysis  to  be  conducted  to 
determine  whether  significant  differences  generally  exist 
between  the  computer-assisted  and  conventional  versions,  or 
whether  differences  apply  only  to  specific  types  of  subtests 
or  to  individuals  with  certain  attributes. 

Population  and  Sample 

The  population  base  for  this  study  comprised  the  ninth 
grade  students  at  Florida's  Orange  County  Public  Schools. 
The total enrollment of the 1988-1989 freshman class was
8,017 students.

Orange  County  has  a  fast  growing  population  that  is 
very  mobile  and  transient  in  nature.     The  composition  of  the 
county's  population  is  diverse  in  terms  of  socioeconomic 
variables  such  as  cultural  background,  annual  family  income, 
and  parental  educational  level.     The  July  1989  Orange 
County  Public  Schools  Student  Enrollment  Summary 
showed  the  total  senior  high  school  population  to  be 
25,656. The breakdown with respect to race is shown in
Table 3-1.

Table 3-1

Population Breakdown

Senior High School Students - % by Race

White    64.57%
Black    23.23%
Other    12.00%


In terms of performance on standardized tests, Orange
County's mean scores closely approximate those of the United
States' population. The mean standardized test scores for
the senior high school population and the national mean
scores are shown in Table 3-2.

Table 3-2

Scholastic Aptitude Test (SAT) Mean Scores

Measure         1988 Orange County    1988 National
                Mean Score            Mean Score

Verbal          424                   428
Quantitative    475                   476

The Senior Exit Surveys have indicated over the
last four years that approximately two-thirds of the Orange
County graduates planned to attend a Florida college or
university. The breakdown of senior plans for the past four
years is located in Table 3-3.


Table 3-3

Plans of Seniors

Plans                                              Years
                                         1985   1986   1987   1988

Full or part-time employment              56%    76%    59%    62%
To attend a two or four year public
  Florida college                         74%    67%    57%    60%
To attend a private Florida college        8%     6%     6%     6%
Full-time homemaker                        1%     1%     0%     1%
Planning to enter military                 6%    10%     9%     9%

The curricula at Orange County public high schools
contain programs for the academically talented and the
academically disadvantaged, as well as programs for average
students. The following specific programs are offered:
education  for  the  gifted,  advanced  classes,  regular  classes, 
basic  classes,  educational  programs  for  learning  disabled 
students,  and  educational  programs  for  emotionally 
handicapped  students. 

To be included in the sample, students only needed to
be members of the Orange County public school ninth grade
population. No particular psychological characteristics
were used as selection criteria. The only obvious attribute
common to all participants was their willingness to be
tested.

Approach  to  the  Study 
A  true  experimental  research  design  involving  the 
establishment  of  a  control  group  and  an  experimental  group 
was  selected  for  this  study.     This  was  done  in  order  to  be 
able  to  estimate  the  relative  impact  on  test  scores  caused 
by  changing  the  type  of  administration   (computer  or 
conventional)   and  to  distinguish  that  impact  from  the 
effects of changing the test form (V or W). According to
Isaac and Michael (1983), a true experimental design has the
purpose of investigating possible cause-and-effect
relationships  by  exposing  an  experimental  group  to  a 
treatment  condition  and  comparing  the  results  with  those  of 
a  control  group  that  has  not  received  the  treatment.  For 
this  research,  the  "treatment"  is  the  computerization  of  the 
Form  V  and,  accordingly,  its  impact  on  the  student's  test 
results.     To  assess  the  impact  of  the  treatment,  the 
experimental  group  received  the  computerized  Form  V  and  the 
control  group  received  the  conventional  Form  V.  In 
addition,  both  the  control  and  experimental  groups  were  also 
administered  the  conventional  Form  W.     This  permitted  the 
verification  of  equivalency  of  the  control  and  experimental 
groups  and  a  delineation  of  the  effect  of  changing  test 
forms.

The specific experimental approach used for this
project  was  the  multi-factor  design.     The  multi-factor 
design  permits  the  incorporation  of  additional  variables 
into  the  design.     The  additional  or  assigned  variables  are 
used  to  obtain  greater  homogeneity  within  test  groups  as 
well  as  to  gain  more  information  within  the  context  of  a 
single  experimental  design   (Kennedy  &  Bush,  1985) .  Thus, 
the  opportunity  exists  to  generalize  findings  across  the 
levels  of  the  assigned  variable.     The  multi-factor  design 
that  incorporates  an  assigned  variable  can  also  be  used  to 
effect  a  relative  reduction  in  unexplained  variance  and, 
thereby,  increases  the  efficiency  of  the  design. 

The  reduction  of  unexplained  variance  can  be 
accomplished  in  several  ways.     The  specific  strategy  that 
was  employed  for  the  purposes  of  this  study  involved  the 
establishment  of  matched  groups  by  incorporating  a 
concomitant  variable.     A  concomitant  variable  is  a  variable 
that  correlates  statistically  with  the  researcher's 
dependent  variables  and  is  used  as  the  basis  for  the 
subsequent blocking of observations. A matched group,
according to Edwards (1968), is one in which subjects are
grouped into blocks on the basis of their expected
homogeneous response on some dependent variable. This
approach is superior to both "matched pairs" and completely
randomized groups. Specifically, Edwards claims that matched
groups avoid the problems typically associated with matched
pairs (i.e., matching can only guarantee that the subjects
are equivalent on the variable(s) used to do the matching)
and at the same time yield a smaller expected error mean
square in the analysis of variance for the same number of
observations than completely randomized groups.

In this study, academic track was selected as the
concomitant variable since it generally correlates highly
with a measure of one's aptitude (Anastasi, 1981). The
design requires that students be blocked according to
academic track and then randomly assigned in equal numbers
to either the experimental or control group. In this
design, two blocks were established: advanced and average.

The  blocking  variable  of  academic  track  is  crossed  with 
the  control  and  experimental  groups   (Variable  A) .     For  this 
experiment,  the  blocks   (Variable  B)  were  fixed.     The  first 
level  of  the  blocking  variable  was  the  advanced  academic 
achievers  and  the  other  level  of  the  blocking  variable  was 
the  average  academic  achievers. 

The  students  who  received  the  aptitude  tests  were 
nested  within  the  blocks.     Kennedy  and  Bush  (1985)  draw  the 
distinction  between  nested  and  crossed  in  that  a  nested 
group  receives  only  one  treatment  whereas  in  crossed 
experiments  the  subjects  are  exposed  to  all  levels  of 
treatment.     The  test  type   (Variable  C)  had  repeated 
measurements   (Form  V  and  W) .     Specifically,  there  are  two 
levels  of  Variable  A  (control  and  experimental);  two  blocks, 
advanced  and  average  achievers,  within  the  control  and 
experimental  groups;  and  eighty  total  test  scores   (Form  V 
and W for each subject). The data matrix is shown in Table
3-4.

Table 3-4
Data Matrix

Group          Block      Subject   Form W          Form V

CONTROL        Advanced   S_1       X_{1,1,1,1}     X_{1,1,1,2}
                          ...       ...             ...
                          S_10      X_{10,1,1,1}    X_{10,1,1,2}

               Average    S_11      X_{11,1,2,1}    X_{11,1,2,2}
                          ...       ...             ...
                          S_20      X_{20,1,2,1}    X_{20,1,2,2}

EXPERIMENTAL   Advanced   S_21      X_{21,2,1,1}    X_{21,2,1,2}
                          ...       ...             ...
                          S_30      X_{30,2,1,1}    X_{30,2,1,2}

               Average    S_31      X_{31,2,2,1}    X_{31,2,2,2}
                          ...       ...             ...
                          S_40      X_{40,2,2,1}    X_{40,2,2,2}

In addition to employing experimental procedures,
correlational research procedures were employed to compare
the rank-order of scores for the two methods of test
administration. It was anticipated that the experimental
design augmented by the correlational investigation would
produce the requisite data to adequately establish the
equivalence of the computerized version in accordance with
the criteria outlined by APA (1987).

A secondary goal of the research was to examine the
degree to which ancillary factors corresponded with
variations in individual performance. To gain an
understanding of the interrelationship between
characteristics of participants (e.g., degree of computing
experience) and test scores, the data from the
self-report questionnaire were analyzed using multiple
linear regression techniques. These analyses allowed for
the identification of variables that either individually or
collectively correlated with, or possibly influenced, test
performance.

Research  Hypothesis 
The  research  was  based  on  the  premise  that  differences 
in  testing  environments  and  methods  will  lead  to  differences 
in  performance.     Specifically,  the  primary  research 
hypothesis was: Computer-assisted testing methods yield
different results from those obtained from the use of
conventional  methods.     The  specific  null  hypotheses  tested 
across  each  of  the  subtests  are  listed  below. 

1.  There is no significant rank-order relationship
between the performance of students on computerized and
conventionally administered tests.

2.  There  is  no  significant  difference  between  the 
experimental  and  control  group  in  terms  of  mean  scores 
on  the  Form  W. 

3.  There is no significant difference between students'
mean scores on the first test administration and the
second test administration of Form W due to learning
effect.

4.  There  is  no  significant  difference  in  mean  scores 
between  computerized  and  conventional  versions  of  Form 
V. 

5.  There  is  no  significant  difference  between  the  advanced 
and  average  students  with  respect  to  ability  on  the 
computerized  tests. 

6.  There  are  no  significant  differences  in  the  shapes  of 
the  distributions  of  scores  for  computerized  and 
conventional  versions  of  the  test. 

7.  There are no significant differences in subtest
normative scores for computerized versions compared to
the conventional versions.

8.  There are no significant influences on computerized
test performance that can be attributed to attitude
toward computers, preference for dealing with machines,
previous computer experience, and previous typing
experience.

Relevant  Variables 
The  research  involved  independent,  dependent,  and 
concomitant  variables.     In  this  study,  the  independent 
variable  is  the  method  of  aptitude  testing.     The  two  levels 
of  this  factor  consisted  of  the  conventional  multiple 
aptitude  battery  and  the  computer-assisted  multiple  aptitude 
battery. 

The  following  dependent  variables  were  analyzed  to  test 
the  hypotheses:     a)   the  individual  subtest  scores  on  the 
computer-assisted  test  of  each  participating  respondent;  and 
b)   the  individual  subtest  scores  on  the  conventional  test  of 
each  participating  respondent.     The  third  type  of  variable 
in  the  study  is  the  concomitant  variable.     The  concomitant 
variable  that  was  used  in  the  study  is  the  subjects' 
academic grouping category (i.e., advanced and
average).

Sampling Procedures

The  subjects  for  the  study  were  a  volunteer  sample  of 
ninth  grade  students  who  attended  four  Orange  County  public 
high  schools.     Forty-four  students,  with  parental  consent, 
volunteered  to  participate  in  the  study.     A  copy  of  the 
Consent Form is found in Appendix A. Of these 44
volunteers, 20 students were from advanced tracks, 23
students were from regular tracks, and one student was from
the basic track.

The  experimental  research  design  required  that  two 
equivalent  groups  be  established.     In  addition,  the  design 
specified  the  use  of  a  blocking  variable  based  on  advanced 
and  average  academic  tracks. 

The  20  advanced  students  were  randomly  assigned,  in 
equal  numbers,  to  either  the  experimental  or  control  group. 
From the 23 regular students, 20 were randomly selected by
dropping every seventh name on the list (three names in all). The
resultant  20  regular  students  were  then  randomly  assigned  to 
one  of  the  two  groups  so  that  each  group  contained  a  total 
of  twenty  students.     Each  block  within  the  group  contained 
10  students  from  either  advanced  or  average  academic  tracks. 
Tests  for  group  and  block  equivalence  will  be  discussed  in 
the  data  analysis  section.  The  three  regular  students  whose 
names  were  dropped  from  the  list  and  the  one  basic  student 
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were  also  tested  and  provided  their  test  results  and 
concomitant  career  planning  reports;  however,  test  scores 
from  these  students  were  not  included  in  the  data  analysis. 
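
The selection and assignment steps described above can be sketched in a few lines of Python. This is only an illustration of the blocking logic, not the procedure actually used in the study: the student identifiers, the function names `drop_every_kth` and `assign_groups`, and the fixed random seed are all assumptions introduced here.

```python
import random


def drop_every_kth(names, k):
    """Split a list into (kept, dropped) by removing every k-th name (1-based)."""
    kept = [n for i, n in enumerate(names, start=1) if i % k != 0]
    dropped = [n for i, n in enumerate(names, start=1) if i % k == 0]
    return kept, dropped


def assign_groups(advanced, regular, seed=0):
    """Randomly split each academic block evenly into control and experimental."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    groups = {"control": {}, "experimental": {}}
    for block_name, block in (("advanced", advanced), ("average", regular)):
        shuffled = block[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        groups["control"][block_name] = shuffled[:half]
        groups["experimental"][block_name] = shuffled[half:]
    return groups


# Hypothetical identifiers standing in for the 20 advanced and 23 regular volunteers.
advanced = [f"A{i:02d}" for i in range(1, 21)]
regular = [f"R{i:02d}" for i in range(1, 24)]

regular_kept, regular_dropped = drop_every_kth(regular, 7)
groups = assign_groups(advanced, regular_kept)
```

With 23 names, dropping every seventh name removes positions 7, 14, and 21, leaving 20 students, so that each group receives 10 advanced and 10 average students.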
Data  Collection  and  Test  Procedures 

Potential  subjects  were  presented  a  description  of  the 
purpose  of  the  study  and  instructions  for  participating  in 
the  study.     This  was  followed  by  a  discussion  on  the  need 
for  the  consent  form.     Consent  forms  were  distributed  and 
subsequently  collected.     Only  the  subjects  who  had 
volunteered  and  who  had  parental  consent  were  allowed  to 
participate.     To  facilitate  the  testing  process,  and  in 
consonance  with  the  confidentiality  safeguards,  each  subject 
was  assigned  a  number.     All  hardcopy  data  for  each  student 
such  as  test  scores  and  completed  questionnaires  were 
identified  by  the  assigned  numbers  only. 

Students  assigned  to  the  control  test  group  were 
provided  detailed  instructions  as  specified  by  the 
Psychological  Corporation  on  how  the  test  would  be 
administered.     During  the  initial  testing  sessions,  half  of 
the  students  were  administered  conventional  Form  V  and  the 
other  half  were  administered  conventional  Form  W.  During 
the  subsequent  testing  sessions  students  were  administered 
the  form  they  had  not  taken  previously. 

Students assigned to the experimental test group received detailed instruction on how the computerized and conventional tests would be administered. During the initial testing sessions, half of the students were administered the computerized version of Form V and the other half were administered the conventional Form W. During the second set of testing sessions, students were administered the version they had not taken previously. The responses to the conventionally administered tests were recorded on test record forms. The responses to the computerized version were recorded electronically and subsequently printed for examination by the researcher.

Upon completion of the tests, each of the 20 students in the experimental test group was asked to complete the researcher-developed questionnaire (Appendix B). At the conclusion of the activity, all hard copy test data and questionnaires were placed in envelopes numbered to correspond with the numbers previously assigned to each subject. The students were thanked for participating in the study and were informed as to when they might expect the test results and career-planning reports.

Data  Analysis 

The  test  data  and  survey  data  were  analyzed  using  both 
parametric  and  non-parametric  statistics.  Statistical 
computer  packages  were  employed  where  possible  to  facilitate 
the  analysis  and  preclude  the  potential  introduction  of 
numerical  errors.     In  addition  to  the  specific  statistical 
tests  selected  for  examination  of  the  various  hypotheses,
descriptive  statistics  and  graphs  were  made  for  each  group 
for  all  of  the  subtests. 

The  rank  order  equivalency  of  computerized  Form  V  was 
investigated  by  a  Spearman-Rho  correlational  analysis.  This 
procedure  was  performed  to  determine  the  relative  ranking  of 
individuals  within  the  experimental  group  on  the 
computerized  Form  V  and  conventional  Form  W.  For 
comparative  purposes,  a  similar  analysis  was  performed  for 
the  control  group.     In  addition,  a  standard  correlational 
analysis  was  made  on  individual  test  scores  for  each 
subtest.     These  data  were  also  compared  with  previously 
reported  correlational  analyses. 
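
As an illustration of the rank order analysis, the following minimal sketch computes the Spearman rank order correlation as the Pearson correlation of the ranks, with tied scores given the average of their rank positions. The score lists are hypothetical and the function names are introduced here for illustration; the statistical package actually used in the study is not identified in the text.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical subtest scores for five students on the two forms.
form_v = [31, 25, 40, 18, 27]
form_w = [29, 27, 38, 20, 24]
rho = spearman_rho(form_v, form_w)
```

In this hypothetical example only two students exchange adjacent ranks between the forms, so the coefficient is high (0.9).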

The hypothesis relating to group equivalency was tested, for each subtest, using a two-way analysis of variance with replication. This statistical test was applied only to the Form W scores and included scores from both the control and experimental groups. The learning effect was also examined in this analysis.

The  hypothesis  related  to  test  score  equivalency  was 
tested  by  a  three-way  analysis  of  variance.     This  included 
control  and  experimental  groups,  advanced  and  average 
academic  achievers,  and  Forms  V  and  W.     Subsequently,  a 
Dunn's  test  was  used  to  determine  statistically  significant 
differences  between  cell  means.     This  sequence  of  analytical 
procedures  was  applied  to  each  of  the  subtests. 

The relative equivalency of the shape of the distribution of scores was determined by use of a Kolmogorov-Smirnov goodness-of-fit test. This included constructing histograms for both the conventional and computerized versions of Form V and developing a cumulative distribution of scores.
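
The comparison of cumulative distributions can be sketched as follows. The sketch assumes the two-sample form of the Kolmogorov-Smirnov statistic (the largest vertical gap between two empirical cumulative distributions); the text does not state which form was used, and the score lists and function name are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical cumulative distributions."""
    points = sorted(set(sample_a) | set(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    d_max = 0.0
    for p in points:
        # Proportion of each sample at or below the score p.
        cdf_a = sum(1 for v in sample_a if v <= p) / n_a
        cdf_b = sum(1 for v in sample_b if v <= p) / n_b
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


# Hypothetical Form V scores for a conventional and a computerized group.
conventional = [20, 24, 28, 32, 36]
computerized = [22, 26, 30, 34, 38]
d = ks_statistic(conventional, computerized)
```

A statistic near 0 indicates nearly identical distribution shapes, while a statistic of 1 indicates that the two sets of scores do not overlap at all. The statistic would then be compared with a tabled critical value for the two sample sizes.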

The  adequacy  of  the  current  normative  scores  for  the 
application  of  the  computerized  Form  V  was  further  verified 
by  regression  analyses.     The  test  mode   (computer  or 
conventional)  was  a  blocking  variable  and  the  score  on  the 
Form  W  was  the  regressor  variable   (x) .     This  procedure  was 
applied  to  the  entire  sample  population  for  each  subtest. 
The model for this regression analysis was:

\[ y = B_0 + B_1 X_1 + B_2 X_2 \]

where   $y$   = predicted V score
        $B_0$ = intercept
        $B_1$ = slope
        $X_1$ = W score
        $B_2$ = shift
        $X_2$ = 0 if Form V conventional
              = 1 if Form V computerized

The relevance of potentially related factors was investigated by multiple linear regression. The responses on the self-reported questionnaire and the score on the Form W were the regressor variables. This test applied only to the experimental group and was performed for each of the subtests. The regression model used was:

\[ y = B_0 + B_1 X_1 + B_2 X_2 + B_3 X_3 + B_4 X_4 + B_5 X_5 + B_6 X_6 + B_7 X_7 \]

where   $y$   = predicted V score
        $B_0$ = intercept
        $B_1, \ldots, B_7$ = slopes
        $X_1$ = W score
        $X_2$ = academic track
        $X_3$ = attitude toward computers
        $X_4$ = preference for machines
        $X_5$ = computer experience
        $X_6$ = typing experience
        $X_7$ = W administered first/second

Summary 

Chapter  III  presented  the  approach  to  the  study  and 
justification  for  its  selection.     Relevant  variables  were 
delineated  and  detailed  descriptions  of  instrumentation  and 
population  were  provided.     The  chapter  concluded  with  a 
description  of  data  collection  and  data  analysis  procedures. 


CHAPTER  IV 
RESULTS 

This  chapter  provides  pertinent  demographic  details  on 
the  sample  population  and  the  statistical  analyses  of  test 
scores.     Descriptive  statistics  used  to  characterize  the 
sample  group  of  40  freshman  students  are  provided. 

Demographic  Information  of  Sample  Population 

Descriptive  statistics  were  computed  for  the 
demographic  variables  for  both  the  control  and  experimental 
groups.     Table  4-1  provides  a  delineation  of  the 
participants  by  gender. 

Table 4-1
Male/Female Ratio

Gender      Control Group    Experimental Group    Total Sample
            n (%)            n (%)                 n (%)

Males        7 (35%)          6 (30%)              13 (32.5%)
Females     13 (65%)         14 (70%)              27 (67.5%)

A delineation of the study's participants by race for both groups appears in Table 4-2. Although the sample breakdown in terms of race generally approximated that of the Orange County high school population, its percentage of white students was somewhat greater than that of the Orange County population (see Table 3-1).

Table 4-2
Breakdown by Race

Race      Control Group    Experimental Group    Total Sample

White     16 (80%)         16 (80%)              32 (80%)
Black      3 (15%)          2 (10%)               5 (12.5%)
Other      1 (5%)           2 (10%)               3 (7.5%)


Table  4-3  provides  a  delineation  of  the  participants  by 
academic  tracks.       Students  assigned  to  the  advanced  tracks 
have  generally  performed  in  the  7th,  8th,  and  9th  stanine 
categories  on  standardized  tests.     Students  assigned  to  the 
average  tracks  have  performed  in  the  4th,  5th,  and  6th 
stanine  categories. 


Table 4-3
Breakdown by Academic Track Classification

Academic Track    Control Group    Experimental Group    Total

Advanced          10               10                    20
Average           10               10                    20

Test  Data 

The mean and standard deviation for each subtest for both forms are provided in Tables 4-4 and 4-5. Also shown are the composite averages reported in the DAT Technical Supplement (Bennett et al., 1984) for conventional Forms V and W.

Table 4-4
Form V Subtest Results

              Control Group     Experimental Group    Technical Supplement
Subtest       Mean     SD       Mean     SD           Mean     SD

Verbal        21.6      8.4     24.9     10.0         24.9      9.9
Numerical     22.6      9.1     23.4      8.2         22.0      8.0
Abs Reas      28.0     10.1     30.7      9.1         30.1      7.7
Clerical      54.2     14.1     40.3      7.2         43.6     11.2
Mechanical    40.3     11.8     45.9     11.6         45.4      9.0
Space         26.3     15.3     25.4     11.8         29.3     10.4
Spelling      62.4     12.3     63.4     13.1         60.6     15.1
Language      22.5      6.6     26.4      7.1         25.9      7.3

Table 4-5
Form W Subtest Results

              Control Group     Experimental Group    Technical Supplement
Subtest       Mean     SD       Mean     SD           Mean     SD

Verbal        20.2      7.9     24.1      9.0         23.6      9.9
Numerical     22.2      7.0     21.3      9.2         20.9      8.1
Abs Reas      28.5     10.1     31.8      9.4         29.9      8.2
Clerical      52.5     13.6     50.7     12.8         44.5     11.9
Mechanical    39.5     12.7     45.5     12.1         44.2      9.4
Space         28.2     12.5     24.5     11.4         28.3     11.0
Spelling      60.3     11.1     63.3     14.1         58.9     14.6
Language      24.1      6.6     27.2      7.6         25.8      7.4

It  should  be  noted  that  the  mean  for  each  subtest  for 
both  the  control  and  the  experimental  groups  was  within  one 
standard  deviation  of  the  mean  score  for  the  281  subjects 
reported  in  the  DAT  Technical  Supplement.     Also,  the 
standard  deviations  were  approximately  the  same. 
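
The "within one standard deviation" observation can be checked directly against the tabled values. The sketch below uses the Form W figures from Table 4-5 and assumes that the standard deviation in question is the one reported in the DAT Technical Supplement; the dictionary and function names are introduced here for illustration.

```python
# Form W values from Table 4-5:
# (control mean, experimental mean, Technical Supplement mean, Technical Supplement SD)
FORM_W = {
    "Verbal":     (20.2, 24.1, 23.6, 9.9),
    "Numerical":  (22.2, 21.3, 20.9, 8.1),
    "Abs Reas":   (28.5, 31.8, 29.9, 8.2),
    "Clerical":   (52.5, 50.7, 44.5, 11.9),
    "Mechanical": (39.5, 45.5, 44.2, 9.4),
    "Space":      (28.2, 24.5, 28.3, 11.0),
    "Spelling":   (60.3, 63.3, 58.9, 14.6),
    "Language":   (24.1, 27.2, 25.8, 7.4),
}


def within_one_sd(group_mean, reference_mean, reference_sd):
    """True when the group mean lies within one reference SD of the reference mean."""
    return abs(group_mean - reference_mean) <= reference_sd


# For each subtest: (control within one SD?, experimental within one SD?)
results = {
    subtest: (within_one_sd(control, ref_mean, ref_sd),
              within_one_sd(experimental, ref_mean, ref_sd))
    for subtest, (control, experimental, ref_mean, ref_sd) in FORM_W.items()
}
all_within = all(c and e for c, e in results.values())
```

The largest departure in Table 4-5 is the control group mean on the clerical subtest (52.5 against a Technical Supplement mean of 44.5), which is still inside the reported standard deviation of 11.9.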


For visualization purposes, the data have been plotted in Figures 4-1 through 4-8. These figures show that the plotted scores are relatively similar for each subtest, with the exception of the clerical speed and accuracy subtest, for which a distinct shift was noted between the Form V (computerized) and Form W (conventional) subtest for the experimental group.

Correlational Analysis

The first indicator of test similarity was the high correlation between scores achieved on Form V and Form W for both the control and experimental groups. These data are provided in Table 4-6. For comparative purposes, the table also contains the range of correlations reported in the DAT Technical Supplement and the correlations reported by McBride (1986).

Table 4-6
Correlation of Form V and Form W Test Scores

              Control    Experimental    Technical      McBride
Subtest       Group      Group           Supplement     Report

Verbal        .87        .77             .83 - .90      .89
Numerical     .86        .77             .82 - .89      .87
Abs Reas      .84        .86             .66 - .81      .79
Clerical      .77        .65             .64 - .86      .28
Mechanical    .82        .75             .72 - .80      .79
Space         .72        .72             .67 - .81      .87
Spelling      .82        .83             .77 - .89      .89
Language      .74        .79             .69 - .87      .85

With reference to Table 4-6, the correlations between Forms V and W for the control group are seen to be within, or higher than, the range presented by Bennett et al. (1984) in the DAT Technical Supplement. This would indicate stability of the test population with respect to previously established test correlations. The next observation is that the experimental group also fell within the range reported by Bennett et al. (1984). However, the correlations for the experimental group were somewhat lower than those for the control group for the verbal reasoning, numerical ability, clerical speed and accuracy, and mechanical reasoning subtests. The final comparison was between the correlations reported by McBride (1986) and those calculated for the experimental group. In general, these correlations were comparable. The largest difference was on the clerical speed and accuracy subtest, wherein the experimental group had a significantly higher correlation than that reported by McBride (1986).

Hypothesis  1:  There  is  not  a  significant  rank-order
relationship  between  the  performance  of  students  on 
computerized  and  conventionally  administered  tests. 

This  hypothesis  was  tested  by  the  Spearman-Rho  rank 
order  correlation  and  was  accepted  for  seven  of  the  eight 
subtests.     The  results  are  provided  in  Table  4-7. 
Comparable  data  were  not  reported  by  either  Bennett  et  al. 
(1984)   or  McBride   (1986).     The  rank  order  was  also  compiled 
for  the  control  group  for  comparative  purposes. 




Table 4-7
Rank Order Correlation

Subtest          Control Group    Experimental Group

Verbal           .73              .79
Numerical        .87              .81
Abstract Reas    .89              .85
Clerical         .73              .54
Mechanical       .81              .82
Space            .74              .76
Spelling         .86              .81
Language         .77              .73

The rank order correlations for each of the subtests were comparable for the control and experimental groups, with the exception of the clerical speed and accuracy subtest. Although no specific criterion for rank order correlation has been provided by the APA (1987) guidelines, these results would cause one to conclude that equivalency had not been established for that subtest.

Analysis  of  Variance 

Hypothesis 2: There is no significant difference
between the experimental and control groups in terms of
mean scores on the Form W.

Hypothesis 3: There is no significant difference
between students' mean scores on the first
administration and the second administration of Form W
due to learning effect.

An  analysis  of  variance  to  detect  differences  between 
groups  was  performed  for  each  of  the  subtests  of  the  Form  W. 
The  results  are  provided  in  Table  4-8.     Also  included  are 
the  results  with  respect  to  the  analysis  of  variance  for  the 
learning  effect. 

Table 4-8
Analysis of Variance Results for Group Equivalency and Learning Effect

Subtest               Factor      Sum of Squares    F-Ratio    Alpha

Verbal                Group             152.1         2.02
                      Learning           12.1          .16
                      Error            2703.8

Numerical             Group               7.2          .10
                      Learning           38.0          .55
                      Error            2493.7

Abstract Reasoning    Group             108.9         1.14
                      Learning          136.9         1.43
                      Error            3433.8

Clerical              Group              32.4          .19
                      Learning          324.9         1.92
                      Error            6095

Mechanical            Group             360.0         2.38
                      Learning          122.5          .81
                      Error            5449.4

Space                 Group             136.9          .93
                      Learning          122.5          .84
                      Error            5281.4

Spelling              Group              90.0          .58
                      Learning          305.5         1.94
                      Error            5611.4

Language              Group              96.1         1.93
                      Learning           57.6         1.16
                      Error            1793.0

One  cannot  reject  the  null  hypothesis  that  there  was  no 
difference  between  groups  for  any  of  the  subtests.  The 
groups  were  evenly  matched  due  to  the  randomization  process 
used  for  assigning  individuals.     There  was  also  no 
significant  difference  between  the  first  administration  and 
the  second  administration  due  to  a  learning  effect  for  any 
of  the  subtests. 

Hypothesis 4: There is no difference in mean scores
between computerized and conventional versions of Form
V.

Hypothesis 5: There is no difference between the
advanced and average students with regard to ability on
the computerized tests.

The above two hypotheses were examined using analysis of variance. Results are provided for each subtest in Tables 4-9 through 4-16. In these tables, Factor A refers to groups, Factor B refers to academic category, and Factor C refers to test form.

Table 4-9
Analysis of Variance Results for Verbal Reasoning Subtest

             Sum Squares    Mean Squares    F-Test Ratio    Alpha

Factor A        259.2          259.2           2.73
Factor B       1980.1         1980.1          20.87
A x B             4.1            4.1            .042
Error          3415.9           94.9

Factor C         24.2           24.2           1.55
A x C             1.8            1.8            .12
B x C             .05            .05            .003
A x B x C         4.0            4.0            .25
Error           560.9           15.6
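
The F ratios in Table 4-9 can be reproduced, to within rounding, from the tabled sums of squares. The sketch below assumes one degree of freedom for each effect (each factor has two levels) and 36 degrees of freedom for each error term (40 subjects less the four group-by-track cells); these degrees of freedom are not stated in the table and are an assumption made here. Factors A, B, and A x B are tested against the first error term, and the effects involving test form (Factor C) against the second.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F ratio: mean square of the effect divided by mean square of the error."""
    return (ss_effect / df_effect) / (ss_error / df_error)


DF_EFFECT = 1   # assumption: every factor has two levels
DF_ERROR = 36   # assumption: 40 subjects minus 4 group-by-track cells

# Sums of squares for the verbal reasoning subtest (Table 4-9).
between_subjects = {"Factor A": 259.2, "Factor B": 1980.1, "A x B": 4.1}
within_subjects = {"Factor C": 24.2, "A x C": 1.8, "B x C": 0.05, "A x B x C": 4.0}
SS_ERROR_BETWEEN = 3415.9
SS_ERROR_WITHIN = 560.9

f_between = {name: f_ratio(ss, DF_EFFECT, SS_ERROR_BETWEEN, DF_ERROR)
             for name, ss in between_subjects.items()}
f_within = {name: f_ratio(ss, DF_EFFECT, SS_ERROR_WITHIN, DF_ERROR)
            for name, ss in within_subjects.items()}
```

Under these assumptions the error mean squares come to about 94.9 and 15.6, matching the tabled values, and the computed ratios for Factors A, B, and C agree with the tabled 2.73, 20.87, and 1.55.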

Table 4-10
Analysis of Variance Results for Numerical Ability Subtest

             Sum Squares    Mean Squares    F-Test Ratio    Alpha

Factor A          0              0              0
Factor B       2040           2040.2          26.66
A x B            28.8           28.8            .38
Error          2755.2           76.5

Factor C         31.3           31.3           2.12
A x C            14.4           14.4            .98
B x C             2.4            2.4            .17
A x B x C        22.0           22.1           1.50
Error           529.8           14.7

Table 4-11
Analysis of Variance Results for Abstract Reasoning Subtest

             Sum Squares    Mean Squares    F-Test Ratio    Alpha

Factor A        177.0          177.0           1.50
Factor B       2279.1         2279.1          19.37          .01
A x B            63.0           63.0            .53
Error          4236.8          117.1

Factor C         12.0           12.0            .83
A x C             2.1            2.1            .15
B x C            19.0           19.0           1.31
A x B x C         2.8            2.8            .20
Error           520.5           14.5

Table 4-12
Analysis of Variance Results for Clerical Speed and Accuracy Subtest

             Sum Squares    Mean Squares    F-Test Ratio    Alpha

Factor A       1232.5         1232.5           5.16          .05
Factor B       1022.5         1022.5           4.28          .05
A x B            45             45              .19
Error          8291.3          238.6

Factor C     [illegible]    [illegible]     [illegible]
A x C           732.0          732.0          15.49          .01
B x C            18.0           18.0            .38
A x B x C         7.2            7.2            .15
Error          1700.9           47.2

Table 4-13
Analysis of Variance Results for Mechanical Reasoning Subtest

             Sum Squares    Mean Squares    F-Test Ratio    Alpha

Factor A        672.8          672.8           3.06
Factor B       1729.8         1729.8           7.86          .01
A x B           192.2          192.2            .87
Error          7922.2          220.1

Factor C          7.2            7.2            .23
A x C              .8             .8            .02
B x C             9.8            9.8            .31
A x B x C        20             20              .63
Error          1150.2           32.0

Table 4-14
Analysis of Variance Results for Space Relations Subtest

             Sum Squares    Mean Squares    F-Test Ratio    Alpha

Factor A        108.1          108.1            .56
Factor B       2726.1         2726.1          14.24          .01
A x B          1058.5         1058.5           5.53          .05
Error          6892.4          191.5

Factor C          4.5            4.5            .11
A x C            37.8           37.8        [illegible]
B x C             1.0            1.0            .02
A x B x C       332.1          332.1        [illegible]
Error          1456.1           40.4

Table 4-15
Analysis of Variance Results for Spelling Subtest

             Sum Squares    Mean Squares    F-Test Ratio    Alpha

Factor A         82.0           82.0            .33
Factor B       2194.5         2194.5           8.92          .01
A x B           122.5          122.5            .50
Error          8856.3          246.0

Factor C         27.6           27.6            .96
A x C            19             19              .66
B x C             0              0              0
A x B x C        56.1           56.1           1.96
Error          1031.8           28.7

Table 4-16
Analysis of Variance Results for Language Usage Subtest

             Sum Squares    Mean Squares    F-Test Ratio    Alpha

Factor A        241.51         241.51          3.92
Factor B       1058.51        1058.51         17.16          .01
A x B            17.1           17.1            .28
Error          2220.1           61.1

Factor C         27.6        [remaining entries illegible]
A x C             2.8            2.8            .24
B x C            15.3           15.3           1.33
A x B x C         4.5            4.5            .39
Error           415.3           11.5

As shown in Tables 4-9 through 4-16, the academic track was significant (at alpha = .05 or less) for all subtests. This was expected since, in general, aptitude tests correlate highly with academic performance. On the clerical speed and accuracy test all three factors were significant: group, academic category, and test form. The significant group-by-form interaction provides further indication that this particular subtest may not be equivalent to its conventional counterpart. The only other significant effects (alpha = .05) were, for space relations, an interaction between group and academic category and an interaction among group, academic category, and test form.

In addition to the analysis of variance, it was deemed appropriate to examine the various cell means to determine where significant differences exist. Table 4-17 contains the cell means for the advanced and average students for both the control and experimental subgroups. Table 4-18 contains the results of Dunn's test as applied to the various cell means.

Table 4-17
Cell Means for Advanced and Average Students

                      Control                        Experimental
                Form V         Form W          Form V         Form W
Subtest        Adv    Avg     Adv    Avg      Adv    Avg     Adv    Avg

Verbal         26.5   16.6    24.7   15.6     29.8   19.9    29.5   18.6
Numerical      28.7   16.2    27.1   17.2     27.5   19.3    26.1   16.5
Abs Reas       34.9   21.1    34.0   22.9     35.9   25.9    35.9   27.6
Clerical       59.3   49.1    56.0   48.7     43.3   37.3    53.3   48.0
Mechanical     47.3   33.2    48.8   34.1     48.8   42.9    48.7   42.2
Space          37.7   14.9    35.7   20.6     25.4   25.3    28.8   20.1
Spelling       69.7   55.1    65.7   54.6     66.6   60.3    68.1   58.4
Language       26.4   18.7    26.6   21.6     30.7   22.1    31.1   23.3

Table 4-18
Dunn's Test Results

Subtest columns: VR, NA, AR, CSA, MR, SR, SP, LU

Comparison            Entries [subtest column positions not recoverable]

Ad  Control   VW      (none)
Av  Control   VW      0
Ad  Exp       VW      X
Av  Exp       VW      X, 0
Ad  Cont/Exp  VV      X
Ad  Cont/Exp  WW      X
Av  Cont/Exp  VV      X, X, X, 0, 0
Av  Cont/Exp  WW      X

X implies significant at .05; 0 implies significant at .25

Dunn's Test Results

One can reject the hypothesis that there is no difference between the computer-administered and conventionally administered test for the clerical speed and accuracy subtest. The difference was evidenced in both the high academic achievers and the average achievers. Of particular note were the comparisons of computerized Form V and conventional Form V for the average achievers. Significant differences were found for three of the subtests and detectable differences for two others. Based on these results, equivalency was not established for the clerical speed and accuracy subtest. One might also question whether a bias exists in several of the subtests for the average academic achievers.

Hypothesis 6: There are no differences in the shapes
of the distributions of scores for computerized and
conventional versions of the test variable.

Goodness-of-Fit

A Kolmogorov-Smirnov goodness-of-fit test was performed on the Form V scores, with the cumulative distribution of control group scores being the basis for comparison for the experimental group. This particular test was selected since it did not require a priori knowledge of the shape of the distribution (Lindgren, 1968). The results of the test are provided in Table 4-19, and individual graphs of the cumulative distribution functions are provided in Appendix C.


Table 4-19
Goodness-of-Fit Results

Subtest                 Alpha    Remarks

Verbal Reasoning
Numerical Ability
Abstract Reasoning
Clerical Speed          .01      experimental scores truncated
Mechanical Reasoning    .10      similar shape; some shift
Space Relations
Spelling
Language Usage          .05      similar shape; shift detected

[Two further alpha entries of .15 appear among the first three subtests; their row positions are not recoverable.]

The results in the above table would lead one to conclude that, with the exception of the clerical speed and accuracy subtest, the shapes of the distributions of scores are similar for both the computer-administered and the conventional versions. For the clerical speed and accuracy subtest, the results were statistically different (alpha = .01). An examination of the cumulative distribution function in Appendix C would lead one to conclude that the scores achieved on the conventional version were truncated and the shape of the distribution of scores was distinctly different between the two versions.

The plots of the mechanical reasoning and language usage subtest scores were similar in shape. There were, however, upward shifts detected for the scores achieved on the computer-assisted version of Form V.

Regression  Analysis 

Hypothesis 7: There are no significant differences in
subtest normative score factors for computerized
versions compared to conventional versions.

Regression analyses were performed to ascertain whether the normative factors were equivalent between the two versions of Form V. Table 4-20 provides the results of the regression analysis where the type of test administration (conventional or computerized) was treated as a categorical variable. The $R^2$, or coefficient of determination, is a measure of the general merit of the regression and is the ratio of the sum of squares attributable to the regression to the total sum of squares (Meyers, 1986). The impact of the removal of the categorical variable yields the effect on the coefficient of determination of not knowing which version of the Form V was taken by the respondent.
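
The definition of the coefficient of determination given above (regression sum of squares divided by total sum of squares) can be illustrated with a small least-squares fit. The data points and function names below are hypothetical and serve only to show the arithmetic; they are not taken from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for a single regressor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y):
    """Regression sum of squares divided by total sum of squares."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    predicted = [intercept + slope * a for a in x]
    ss_regression = sum((p - mean_y) ** 2 for p in predicted)
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return ss_regression / ss_total


# Hypothetical paired scores (regressor, response).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2 = r_squared(x, y)
```

For these hypothetical points the regression sum of squares is 3.6 and the total sum of squares is 6, giving a coefficient of determination of 0.6.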

Table 4-20
Regression for Impact on Subtest Norms

                                                        $R^2$ After Removal
Subtest       $B_0$      $B_1$      $B_2$      $R^2$    of $B_2$

Verbal         3.62       .89        -.17      .67      .67
Numerical      3.89       .89        1.57      .63      .62
Abs Reas       4.22       .89        -.11      .72      .72
Clerical      22.74       .60      -12.82      .64      cannot remove
Mechanical    10.90       .74        1.19      .64      .64
Space          3.16       .82        2.09      .52      .51
Spelling      12.65       .83       -1.42      .68      .67
Language       4.68       .74        1.55      .62      .61

With the exception of the clerical speed and accuracy subtest, the quality of the regression was not substantially altered by removal of the categorical variable. Some small adjustment on the order of 1 to 2 points in scaling might be in order. For the clerical speed and accuracy subtest, knowledge of which type of administration of the Form V was used was important in the regression, and an adjustment of 12.8 points in the norms was evident. (Note: Based on the results previously reported for the clerical speed and accuracy subtest, one would not necessarily wish to change the norms but rather examine other solutions such as revision of the computerized test.)
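
To make the size of the shift concrete, the fitted model can be evaluated for a given Form W score under each mode of administration. The sketch below assumes that the three coefficient columns of Table 4-20 are, in order, the intercept, the slope on the Form W score, and the shift for computerized administration, as in the model y = B_0 + B_1 X_1 + B_2 X_2; the function name and the example Form W score of 50 are hypothetical.

```python
# (intercept B0, slope B1 on the Form W score, shift B2 for computerized Form V),
# taken from Table 4-20 under the column assignment assumed above.
COEFFICIENTS = {
    "Verbal":     (3.62, 0.89, -0.17),
    "Numerical":  (3.89, 0.89, 1.57),
    "Abs Reas":   (4.22, 0.89, -0.11),
    "Clerical":   (22.74, 0.60, -12.82),
    "Mechanical": (10.90, 0.74, 1.19),
    "Space":      (3.16, 0.82, 2.09),
    "Spelling":   (12.65, 0.83, -1.42),
    "Language":   (4.68, 0.74, 1.55),
}


def predict_form_v(subtest, w_score, computerized):
    """Predicted Form V score from the Form W score and the test mode."""
    b0, b1, b2 = COEFFICIENTS[subtest]
    x2 = 1 if computerized else 0  # categorical variable for test mode
    return b0 + b1 * w_score + b2 * x2


# Hypothetical student with a Form W clerical score of 50.
conventional = predict_form_v("Clerical", 50, computerized=False)
computerized = predict_form_v("Clerical", 50, computerized=True)
shift = computerized - conventional
```

For the clerical subtest the two predictions differ by the full 12.82-point shift, whereas for the remaining subtests the difference between modes is about 2 points or less.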

Hypothesis 8: There are no significant influences on
computerized test performance that can be attributed to
attitude toward computers, preference for dealing with
machines, previous computer experience, and previous
typing experience.

To examine factors that might be related to performance on the computer-administered version, a complete regression model was formed. This included factors for: advanced or average academic achievement; attitude toward computers; preference for dealing with machines versus people; previous computer experience; typing experience; and which form was taken first. The results are shown in Table 4-21. The $R^2$ denotes the coefficient of determination from Table 4-20, and $R^2_F$ denotes the coefficient of determination when the indicated factors are included. The $R^2$ was computed based on the entire sample population, whereas the $R^2_F$ was computed for the experimental group.

The quality of the regression, as expressed by the coefficient of determination, was significantly improved when considering the full model for numerical ability, abstract reasoning, space relations, spelling, and language usage. The dominant factor in all cases, except for the clerical speed and accuracy subtest, was the score on the Form W. Previous computer experience was a factor in all eight subtests, being a moderately strong factor in two of the eight. Preference for machines was third in overall impact, being a factor in seven of the subtests. Typing experience did not contribute significantly. The coefficient of determination for the clerical speed and accuracy subtest dropped slightly from the initial regression, which included the total population. In the case of this subtest, one's academic category was the major factor in the regression.


CHAPTER  V 

SUMMARY,   CONCLUSIONS,   DISCUSSION,   AND  RECOMMENDATIONS 

Summary 

Restatement  of  the  Problem 

Although  numerous  assessment  instruments  have  undergone 
computer  conversion  in  the  last  several  years,  there  is  very 
little  research  data  to  justify  their  use.  Few  studies  have 
been  conducted  wherein  the  researcher  has  specifically 
investigated  whether  equivalency  exists  between  computerized 
versions  and  their  conventional  counterparts  (Honaker  & 
Harrell,  1987) . 

The main purpose of this study was to investigate whether the computerized adaptive version of the DAT was equivalent to the conventional version of the DAT. The criteria used to determine equivalency were those established by the APA (1987). A secondary purpose was to establish the degree to which ancillary factors (e.g., computing experience) were associated with performance on the computerized version.

Restatement of the Methodology

The  research  method  used  to  investigate  the  equivalency 
issue  was  a  true  experimental  design  that  involved 
establishing  a  control  group  and  an  experimental  group.  A 
matched  group  sampling  procedure  was  employed  to  establish 
equivalent groups. Testing involved administration of Form
W  of  the  DAT  to  both  groups  to  verify  relative  equivalency 
of  the  groups.     Administration  of  the  conventional  Form  V  to 
the  control  group  and  computerized  Form  V  to  the 
experimental  group  provided  insight  with  respect  to  the 
effect  on  test  scores  resulting  from  computerization  of  the 
test  form.     Learning  effects  were  counterbalanced  by 
alternating  the  administration  of  the  two  test  forms. 

The statistical tests used to investigate the
equivalency issues were: the Spearman-Rho rank order
correlation; analysis of variance; Dunn's test;
Kolmogorov-Smirnov goodness-of-fit test; and linear
regression. Linear regression analysis was also performed
to investigate the impact of ancillary factors on test
performance.
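
Two of the statistics named above can be sketched briefly. This is only an illustration under stated assumptions: the rank order correlation is computed as a Pearson correlation on average ranks, and the Kolmogorov-Smirnov statistic is shown in its two-sample form, which may differ from the variant applied in the study. All function names and scores are hypothetical.

```python
from bisect import bisect_right


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, v) / len(sa) - bisect_right(sb, v) / len(sb))
        for v in sa + sb
    )


# Hypothetical subtest scores for illustration only (not the study's data).
conventional = [31, 42, 28, 45, 37, 40]
computerized = [33, 41, 27, 47, 36, 39]

rho = spearman_rho(conventional, computerized)
gap = ks_statistic(conventional, computerized)
```

A rank correlation near 1 indicates that the two sets of scores order individuals similarly. A small KS statistic indicates that the two score distributions have similar shapes. Judging significance would additionally require the appropriate critical values for the sample size.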

Results 

The  computerized  adaptive  version  of  the  DAT  was  found 
to  be  equivalent  to  the  conventional  version  of  the  DAT  for 
7  of  the  8  subtests.     This  equivalency  was  established  based 
on  the  APA  (1987)   criteria  of  comparable  mean  scores, 
dispersions,  comparable  shapes  of  distributions  of  the 
scores,  and  a  similar  rank  ordering  of  individual  scores. 
The  subtest  for  which  equivalency  was  not  established  was 
the  clerical  speed  and  accuracy  subtest. 

A further procedure, which was invoked to confirm the
equivalency, was the calculation and analysis of standard
correlations between the Form V and Form W for both groups.
The observed correlation coefficients compared closely with
those reported by Bennett et al. (1984) for conventional
Form V and conventional Form W. In addition to the
comparability of correlation coefficients, the standard
deviations were determined to be comparable to those
reported by Bennett et al. (1984).

Despite the overall comparability between versions,
there were some significant differences on several subtests
between the conventional Form V and the computerized Form V
for the average students. This raises the possibility that,
while the results in the aggregate may be equivalent, some
built-in biases might exist. Biases could stem from the
very nature of the computerized testing experience in
general, or from the specific adaptive features of this
test, either of which could result in an inaccurate
reflection of an individual's aptitudes. This bias was not
detected, however, in the verbal reasoning or numerical
ability subtests, which Bennett et al. (1984) assert are the
most important tests for assessing one's general aptitude
level.

In  the  overall  sense,  the  norms  appeared  to  be  valid. 
They  were  found  to  be  within  1  or  2  points  for  seven  of  the 
eight  subtests.     The  norms  for  the  clerical  speed  and 
accuracy  subtest,  however,  appear  to  require  substantial 
modification. A revision of the norms of the magnitude indicated
in the regression analysis would cause one to conclude that
it  might  be  preferable  to  devise  an  alternative  format  for 
this  subtest. 

In the investigation of factors considered to be
potentially related to performance on the computerized
Form V, prior computer experience was strongly related to
performance  in  two  subtests  and  weakly  related  to 
performance  in  all  the  other  subtests.     Another  factor 
related  to  performance  was  the  preference  for  working  with 
machines;  this  factor  was  shown  to  be  weakly  related  to 
performance  in  seven  of  the  eight  subtests.  Although 
counterintuitive,  the  analysis  indicated  that  typing 
experience  did  not  play  a  role  in  the  clerical  speed  and 
accuracy  subtest.     Attitude  toward  computers  was  shown  to  be 
a  significant  factor  in  the  numerical  ability  subtest.  This 
is  consistent  with  one's  intuition  that  those  who  feel 
positively  toward  computers  would  be  expected  to  have  a 
higher  aptitude  in  the  numerical  area. 

Conclusions 

As  a  result  of  the  study,  it  was  concluded  that 
computerized  assessment  of  one's  aptitudes  was,  in  general, 
a  suitable  method  of  test  administration  for  the  students  in 
the  sample.     The  sample  population  for  this  research 
consisted  of  volunteers  from  the  Orange  County  public  school 
system  and  appeared,   from  the  descriptive  statistics,  to 
represent  a  cross  section  of  the  county's  freshman  class 
population. The scores of the students of the Orange County
school  district  were  similar  to  the  national  average  scores 
on  the  SAT.     Although  the  results  cannot  necessarily  be 
extrapolated  to  other  school  districts  without  further 
analyses,  there  is  no  reason  to  believe  that  the  findings 
would  not  be  similar  in  other  areas  by  virtue  of  the  average 
aspects  of  Orange  County  and  the  cross  section  of  students 
who  participated  in  this  experiment. 

Despite  the  finding  that  computerized  testing  is 
generally  a  suitable  assessment  procedure,  the  claims  made 
by  Loftus  and  Loftus   (1983)   that  certain  groups  are 
selectively  favored  or  disfavored  by  computerized  testing 
were substantiated. Specifically, the computing experience
factor, which was identified by Burkhead and Sampson (1985),
was found to correlate with performance on the computerized
version. Also, the finding of Hunt and Pelligrino (1984)
that attitude toward computers influences performance on
computerized tests was confirmed by this study.

The  results  of  this  study  also  lead  one  to  conclude 
that  not  all  types  of  conventional  tests  may  be  candidates 
for  computer  conversion.     While  many  of  the  aptitude  tests 
incurred  only  minor  changes  as  a  result  of  the  conversion 
process  and,  therefore,  no  significant  losses  in 
reliability, the clerical speed and accuracy subtest, when
converted, incurred significant changes and concomitant
losses in reliability. In conclusion, while it appears that
the computerized test is a viable assessment method,
additional  reliability  studies  are  necessary  to  delineate 
those  particular  tests  that  can  and  those  that  cannot  be 
converted  to  computer  administration  without  a  reduction  in 
reliability. 

Discussion 

This  experiment,  in  general,  confirmed  the  suitability 
of  computerized  testing  for  assessing  one's  aptitudes.  The 
advantages  which  accrue  from  computerized  testing  were 
demonstrated  by  the  use  of  the  computerized  Form  V.  One 
obvious  advantage  in  using  the  computerized  version  is  the 
decrease  in  the  overall  testing  time  requirement;  this 
feature  allows  the  recipient  more  time  to  devote  to  other 
academic  activities.     Another  advantage  accrues  from  the 
automatic  scoring  and  report  generation  options  available 
with  the  computerized  form.     When  a  large  number  of  students 
are  involved,  this  aspect  can  substantially  reduce  the 
administrative  workload.     The  computerized  adaptive  version 
of  the  DAT  has  the  concomitant  advantage  of  automatically 
generating  a  Career  Planning  Program   (CPP)   report.     The  CPP 
report  provides  relevant  information  for  career  development 
by  combining  DAT  test  scores  with  personal  data   (Tolbert,  1980). 

The clerical speed and accuracy test, as currently
configured, does not appear to adequately measure one's
clerical skills. One would assume that a computer, with its
inherent precision, would be an ideal instrument for
measuring clerical aptitude. One would also have expected
to find a high degree of correlation between typing
experience and performance in the clerical speed and
accuracy subtest, but the results of the experiment do not
support this expectation. This would suggest that
alternative ways, using computer technology, should be
devised, instrumented, and evaluated to measure this skill.

The adaptive aspects of the computerized form, while
desirable when considering the reduced testing time, incur
some risk of introducing bias in the process of estimating
performance and selecting questions in accordance with those
estimates. This bias was evident to some degree on several
subtests for the average academic achievers. Further
investigation appears to be in order regarding how one's
ongoing performance should relate to the selection of
subsequent test questions and to the rules for scoring one's
responses.

The final discussion point relates to prior computing
experience and the influence it appears to have on overall
test scores. Since this factor was pervasive across all
subtests, dealing with this issue raises a dilemma. On the
one hand, it could be argued that the norms for the test
should be altered for those with computing experience since
they appear to have an advantage going into the test.
Conversely, it might be argued that because computers are
becoming so prevalent in such a wide variety of environments
(e.g., academic, business, and home life), computing
ability should be factored into the assessment process. If
one wanted an accurate measure of one's aptitude in our
evolving society, then the bias injected by prior computing
experience may be justified in such an assessment.

Recommendations

The results of this study manifest a need for future
research.     The  following  recommendations  are  made  as 
objectives  for  further  study: 

1.  Development  of  an  improved  computer-assisted  test  for 
measuring  clerical  ability  is  warranted.     After  equivalency 
is  established,  the  modified  subtest  should  replace  the 
current  clerical  speed  and  accuracy  subtest  within  the 
computerized  DAT  battery. 

2.  Studies  that  specifically  examine  the  equivalency  of 
all  computerized  tests  should  be  conducted  on  a  test-by-test 
basis  and  a  conversion-by-conversion  basis.     The  results 
from  the  studies  would  need  to  be  validated  with  respect  to 
various  populations  to  permit  intelligent  interpretation  of 
test  results. 

3. Research oriented toward the identification of
additional personal characteristics that selectively
favor or disfavor individuals for computerized testing
should be conducted.

4.  Classification  of  the  types  of  tests  that  can  and 
cannot  undergo  a  straight  conversion  process  without 
incurring  changes  to  the  nature  of  the  test  should  be 
determined. 

5.  A  research  project  oriented  on  identification  of 
technological  features  that  could  enhance  the  conversion 
effort  for  those  tests  that  cannot  easily  be  converted 
should  be  undertaken. 

6. An investigation to delineate what changes in scoring
norms are attributable to the computerization process and
what changes are due to the adaptive features
should be conducted.

7. An investigation to identify the optimum response mode
options (e.g., light pen) for particular types of
computerized tests must be conducted.

8.  Classification  of  the  various  adaptive  test  models  by 
their  effectiveness  in  measuring  the  various  skills  and 
abilities  must  be  established. 

9.  Research  to  establish  how  computerized  methods  can  be 
successfully  employed  in  order  to  measure  abilities  and 
traits  currently  immeasurable  by  conventional  means  needs  to 
be conducted.

10. A study project to establish ways in which computerized
test methods can be used to improve upon current test
methods and remove some of the cultural bias currently
embedded in conventional test methods needs to be conducted.


Appendix A
CONSENT  FORM 


Consent  Form 


I am a graduate student in the Counselor Education department at
the University of Florida. As part of my dissertation research, I need
to gather information on whether the nature of standardized aptitude
tests is significantly altered when administration is conducted by
computer. In order to collect this information, I will need to
administer the conventional versions of the test to 40 students. In
addition, I will need for 20 of those students to take a computerized
version of the test. In all cases, I will be the only individual either
conducting or supervising the testing process. At the conclusion of the
testing, all students will be asked to complete a brief questionnaire
that will assess attitudes toward and experiences with computers.

Testing will be conducted immediately after school in a designated
classroom. Each testing session will take approximately one hour.
There will be approximately 5 to 7 hours of testing in total. All
students will be given an explanation of the nature of the project and
then clear directions of what is expected during the testing process.
Code numbers will be assigned to each participant; students' names will
NOT be used on any test or survey data. All students (as well as
parents) will be given the opportunity to withdraw permission to
participate in the study at any time without any penalty or prejudice.
All students will be given the opportunity to make an appointment with
the researcher to discuss test results following the project.

There are no risks associated with the testing effort.
Participation or non-participation in this study will not affect
students' grades in any class. Any questions concerning the nature of
the study or this consent may be directed to me, Kathleen Douglas, at
(407) 628-9571.


I have read and I understand the procedure described above. I agree to
allow my child, ____________________, to participate in Ms. Douglas' study
and I have received a copy of the description.

SIGNATURES: 


Parent/Guardian  Date        2nd  Parent/Witness  Date 


Student  Date        Principal Investigator  Date


Appendix B
QUESTIONNAIRE


Questionnaire 


1. I enjoy interacting with computers.        Strongly Agree     5
                                              Agree              4
                                              Undecided          3
                                              Disagree           2
                                              Strongly Disagree  1

2. In general, I prefer working with          Strongly Agree     5
   machines more than people.                 Agree              4
                                              Undecided          3
                                              Disagree           2
                                              Strongly Disagree  1

3. Have you ever worked with computers        YES    NO
   before?


had? 
Questionnaire (Continued)

Questions                                     Responses

5. Have you had any typing experience?        YES    NO

6. How much typing experience have you        MONTHS
   had?


Appendix  C 
GOODNESS-OF-FIT GRAPHS